@spanlens/sdk
v0.6.1
Published
Spanlens SDK — agent tracing, LLM usage capture, and cost observability for TypeScript.
Maintainers
Keywords
Readme
@spanlens/sdk
LLM observability SDK for Spanlens. Record agent traces, LLM calls, tool invocations, and retrievals with a single line change.
Zero-instrumentation mode. Just swap your baseURL to Spanlens proxy and you get request logging + cost tracking automatically. Use this SDK when you also want agent tracing (multi-step workflows, parallel fan-out, nested spans).
💡 Next.js user? Run
npx @spanlens/cli init. The wizard installs this SDK, writes your env var, and auto-rewritesnew OpenAI({...})intocreateOpenAI()for you (30 seconds).
Install
npm install @spanlens/sdk
# or
pnpm add @spanlens/sdk1-line setup (v0.2.0+) ⚡
For the common case where you just want to route your LLM calls through Spanlens for logging + cost tracking, use the pre-configured client helpers. No baseURL to remember:
// Before
import OpenAI from 'openai'
const openai = new OpenAI({
apiKey: process.env.SPANLENS_API_KEY,
baseURL: 'https://spanlens-server.vercel.app/proxy/openai/v1',
})
// After ⚡
import { createOpenAI } from '@spanlens/sdk/openai'
const openai = createOpenAI() // reads SPANLENS_API_KEY + baseURL automaticallyAll three providers supported:
import { createOpenAI } from '@spanlens/sdk/openai'
import { createAnthropic } from '@spanlens/sdk/anthropic'
import { createGemini } from '@spanlens/sdk/gemini'
const openai = createOpenAI()
const anthropic = createAnthropic()
const gemini = createGemini()
// gemini.getGenerativeModel() auto-routes through Spanlens proxyThe returned clients are identical to new OpenAI(...) etc, so all options (timeout, headers, organization, etc.) forward through. Peer dependencies (openai, @anthropic-ai/sdk, @google/generative-ai) are optional. Install only the ones you use.
Prompt A/B tagging (v0.2.2+)
Link a call to a specific Spanlens Prompts version so it shows up in the A/B metrics table:
import { createOpenAI, withPromptVersion } from '@spanlens/sdk/openai'
const openai = createOpenAI()
const res = await openai.chat.completions.create(
{ model: 'gpt-4o-mini', messages: [...] },
withPromptVersion('chatbot-system@3'), // or '@latest' / raw UUID
)Same helper on @spanlens/sdk/anthropic. For observeOpenAI/Anthropic/Gemini, pass promptVersion in options.
Per-user, per-session, and body-redaction tagging
The same headers-style helpers cover the other X-Spanlens-* headers. Available on both @spanlens/sdk/openai and @spanlens/sdk/anthropic.
import { createOpenAI, withUser, withSession, withLogBody } from '@spanlens/sdk/openai'
const openai = createOpenAI()
await openai.chat.completions.create(
{ model: 'gpt-4o-mini', messages: [...] },
{
headers: {
...withUser(currentUser.id).headers, // per-user analytics in /users
...withSession(sessionId).headers, // group calls into one session
...withLogBody('meta').headers, // 'full' | 'meta' | 'none' (body redaction level)
},
},
)withUser(id)tags the call so it shows up under that user in the /users page (cost, tokens, error rate, last seen).withSession(id)groups calls in the same chat / conversation so multi-turn flows are easy to inspect.withLogBody('meta')stores only metadata, not request / response bodies. Use'none'to also drop end-user IDs. Useful for HIPAA-style data minimization without dropping the request entirely.
For multi-step agent tracing (Gantt view, parent/child spans, RAG pipelines), continue to the Quick start below.
Quick start
import { SpanlensClient, observe } from '@spanlens/sdk'
const client = new SpanlensClient({ apiKey: process.env.SPANLENS_API_KEY! })
const trace = client.startTrace({
name: 'support_chat',
metadata: { user_id: 'u_42', session_id: 'sess_abc' },
})
try {
// Manual span
const retrievalSpan = trace.span({ name: 'kb_search', spanType: 'retrieval' })
const docs = await vectorStore.query('...')
await retrievalSpan.end({ output: { doc_count: docs.length } })
// Auto-end via observe helper (handles errors, always closes the span)
const answer = await observe(trace, { name: 'gpt4o_answer', spanType: 'llm' }, async (span) => {
const res = await openai.chat.completions.create({ ... })
span.end({
totalTokens: res.usage!.total_tokens,
costUsd: computeCost(res.usage!),
})
return res.choices[0].message.content
})
await trace.end({ status: 'completed' })
} catch (err) {
await trace.end({ status: 'error', errorMessage: String(err) })
throw err
}API
new SpanlensClient(config)
| Option | Type | Default | Description |
|---|---|---|---|
| apiKey | string | (required) | Spanlens API key (sl_live_...). |
| baseUrl | string | https://spanlens-server.vercel.app | API base URL. |
| timeoutMs | number | 3000 | Request timeout for ingest calls. |
| silent | boolean | true | Swallow network errors so instrumentation never crashes user code. |
| onError | (err, ctx) => void | (none) | Called on every ingest failure (even when silent). |
client.startTrace({ name, metadata? }) → TraceHandle
Starts a new trace. Returns immediately. The backend ingest POST runs in the background.
TraceHandle
.traceId: string. A client-generated UUID..span(options) → SpanHandlecreates a root span under this trace..end({ status?, errorMessage?, metadata? })marks the trace complete (idempotent).
SpanHandle
.spanId: string.child(options) → SpanHandlecreates a nested span (auto-setsparent_span_id)..end({ status?, output?, errorMessage?, promptTokens?, completionTokens?, totalTokens?, costUsd?, requestId?, metadata? })is idempotent.
spanType: 'llm' | 'tool' | 'retrieval' | 'embedding' | 'custom' (default 'custom').
observe(parent, options, fn)
Wraps an async function in a span. Auto-ends the span on success or failure (rethrows the error).
const result = await observe(traceOrSpan, { name: 'work' }, async (span) => {
// span is open here
return doWork()
// span automatically closes; .end() is idempotent so you can still
// call span.end({ totalTokens, costUsd }) inside to capture metrics.
})Framework examples
OpenAI (auto-instrumentation)
import OpenAI from 'openai'
import { SpanlensClient, observeOpenAI } from '@spanlens/sdk'
const spanlens = new SpanlensClient({ apiKey: process.env.SPANLENS_API_KEY! })
// Route OpenAI calls through the Spanlens proxy. The SDK injects
// x-trace-id/x-span-id headers so the proxy's request log is linked
// back to your spans.
const openai = new OpenAI({
apiKey: process.env.SPANLENS_API_KEY!,
baseURL: 'https://spanlens-server.vercel.app/proxy/openai/v1',
})
const trace = spanlens.startTrace({ name: 'support_chat' })
const res = await observeOpenAI(trace, 'answer', (headers) =>
openai.chat.completions.create(
{ model: 'gpt-4o-mini', messages: [{ role: 'user', content: 'Hi' }] },
{ headers },
),
)
await trace.end({ status: 'completed' })Anthropic
import Anthropic from '@anthropic-ai/sdk'
import { SpanlensClient, observeAnthropic } from '@spanlens/sdk'
const spanlens = new SpanlensClient({ apiKey: process.env.SPANLENS_API_KEY! })
const anthropic = new Anthropic({
apiKey: process.env.SPANLENS_API_KEY!,
baseURL: 'https://spanlens-server.vercel.app/proxy/anthropic',
})
const trace = spanlens.startTrace({ name: 'agent_run' })
const res = await observeAnthropic(trace, 'reason', (headers) =>
anthropic.messages.create(
{ model: 'claude-haiku-4-5', max_tokens: 1024, messages: [...] },
{ headers },
),
)
await trace.end()LangChain JS (v0.3.0+)
@spanlens/sdk/langchain ships a drop-in callback handler. Pass it to the
callbacks option of any LangChain chain, LLM, or RunnableConfig. No
proxy URL needed, no imports from @langchain/core required:
import { SpanlensClient } from '@spanlens/sdk'
import { createSpanlensCallbackHandler } from '@spanlens/sdk/langchain'
const client = new SpanlensClient({ apiKey: process.env.SPANLENS_API_KEY! })
const handler = createSpanlensCallbackHandler({ client })
// Works with any chain, LLM, or tool
const result = await chain.invoke({ input: 'Hello' }, { callbacks: [handler] })
// → prompt/completion tokens, model name, latency automatically recordedAttach to an existing trace to nest spans under your workflow:
const trace = client.startTrace({ name: 'my_workflow' })
const handler = createSpanlensCallbackHandler({ client, trace })
await chain.invoke({ input: '...' }, { callbacks: [handler] })
await someOtherStep()
await trace.end()Vercel AI SDK (v0.3.0+)
@spanlens/sdk/vercel-ai provides createSpanlensTracker() whose
onStepFinish and onFinish callbacks spread directly into generateText,
streamText, generateObject, and streamObject options:
import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { SpanlensClient } from '@spanlens/sdk'
import { createSpanlensTracker } from '@spanlens/sdk/vercel-ai'
const client = new SpanlensClient({ apiKey: process.env.SPANLENS_API_KEY! })
const tracker = createSpanlensTracker({ client, modelName: 'gpt-4o' })
const result = await generateText({
model: openai('gpt-4o'),
messages: [{ role: 'user', content: 'Hello!' }],
onStepFinish: tracker.onStepFinish, // optional (captures multi-step tool calls)
onFinish: tracker.onFinish, // required (records final usage)
})Attach to an existing trace:
const trace = client.startTrace({ name: 'ai_pipeline' })
const tracker = createSpanlensTracker({ client, trace, modelName: 'gpt-4o' })
await generateText({ ..., onFinish: tracker.onFinish })
await trace.end()Ollama (local LLMs)
observeOllama() traces calls against a local Ollama instance. Use the OpenAI client pointed at Ollama's OpenAI-compatible endpoint. The wrapper tags the span as provider: 'ollama' so the dashboard charts it separately:
import OpenAI from 'openai'
import { SpanlensClient, observeOllama } from '@spanlens/sdk'
const client = new SpanlensClient({ apiKey: process.env.SPANLENS_API_KEY! })
const ollama = new OpenAI({
baseURL: 'http://localhost:11434/v1',
apiKey: 'ollama', // ignored by Ollama; required by the openai SDK
})
const trace = client.startTrace({ name: 'local_summarize' })
const res = await observeOllama(trace, 'llama3_summary', () =>
ollama.chat.completions.create({
model: 'llama3.1',
messages: [{ role: 'user', content: 'Summarize: ...' }],
}),
)
await trace.end()LlamaIndex TS (v0.3.0+)
@spanlens/sdk/llamaindex hooks directly into LlamaIndex's
Settings.callbackManager, so every LLM call is automatically traced:
import { Settings } from 'llamaindex'
import { SpanlensClient } from '@spanlens/sdk'
import { registerSpanlensCallbacks } from '@spanlens/sdk/llamaindex'
const client = new SpanlensClient({ apiKey: process.env.SPANLENS_API_KEY! })
// Register once at app startup
const unregister = registerSpanlensCallbacks(Settings, { client })
// All subsequent LlamaIndex LLM calls are now traced automatically
const response = await queryEngine.query({ query: 'What is Spanlens?' })
// Clean up when done (e.g. in tests or on process exit)
unregister()Attach to an existing trace for RAG pipelines:
const trace = client.startTrace({ name: 'rag_query' })
const unregister = registerSpanlensCallbacks(Settings, { client, trace })
await queryEngine.query({ query: '...' })
unregister()
await trace.end()Graceful shutdown with client.flush()
Background ingest writes are fire-and-forget. In short-lived processes (scripts, one-shot jobs, serverless cold starts) the process may exit before all POSTs complete. Call flush() before exit to drain them:
const client = new SpanlensClient({ apiKey: process.env.SPANLENS_API_KEY! })
// ... your agent logic ...
await client.flush() // resolves when all in-flight ingest calls have settled
process.exit(0)flush() resolves even if some requests failed. It uses Promise.allSettled internally so a network error won't hang the process.
Design notes
- Fire-and-forget ingest:
startTrace()andtrace.span()return synchronously. Network writes run in the background so your hot path never waits on observability. - Retry with back-off: transient failures (network error, 429, 5xx) are retried up to 3 times with exponential back-off (200 ms → 400 ms → 800 ms). 4xx errors are not retried.
- Client-side UUIDs: idempotent retries are safe, since the same UUID twice is a no-op on the server.
- No unhandled rejections: background POST failures are silently swallowed; use the
onErrorhook for visibility.
License
MIT
