agentpulse-sdk
v0.3.1
Lightweight observability SDK for AI agents — zero manual tracking
agenttrace
Lightweight observability SDK for AI agents (Node.js / TypeScript). Track runs, steps, tokens, cost, and latency — with zero blocking overhead.
const result = await trackRun("hotel-search", { model: "llama-3.3-70b" }, async (run) => {
  const t0 = performance.now()
  const resp = await groq.chat.completions.create({ ... })
  run.addStep({
    stepType: "llm_response",
    input: messages.at(-1).content,
    output: resp.choices[0].message.content,
    tokens: resp.usage.total_tokens,
    latency: Math.round(performance.now() - t0),
  })
  return resp.choices[0].message.content
})

Completed runs are flushed to your AgentTrace dashboard in the background, so your agent never waits on a network call.
Features
- Concurrent-safe: per-run objects on the call stack, no global mutable state
- Non-blocking: background setInterval worker + in-memory queue; the agent loop pays zero network latency
- Reliable: exponential backoff retries with jitter (4 attempts by default)
- Batching: configurable batch size and flush interval
- Zero dependencies: Node built-ins only (http, https)
- ESM + CJS: ships both formats, so it works with both import and require
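
To illustrate the concurrent-safe claim, here is a minimal self-contained sketch of why per-run objects avoid cross-talk. `SketchRun` and `trackRunSketch` are hypothetical stand-ins written for this example, not the SDK's real classes; the point is only that each call owns its own step list.

```typescript
// Hypothetical stand-ins for illustration; not the SDK's real classes.
interface SketchStep {
  stepType: string
  tokens?: number
}

class SketchRun {
  name: string
  steps: SketchStep[] = []

  constructor(name: string) {
    this.name = name
  }

  addStep(step: SketchStep): void {
    this.steps.push(step)
  }
}

// Every call builds a fresh run object and hands it to the callback,
// so no state is shared between concurrent callers.
async function trackRunSketch(
  name: string,
  fn: (run: SketchRun) => Promise<void>,
): Promise<SketchRun> {
  const run = new SketchRun(name)
  await fn(run)
  return run
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Two runs interleave on the event loop, yet each ends up with only its own steps.
async function demo(): Promise<SketchRun[]> {
  return Promise.all([
    trackRunSketch("a", async (run) => {
      run.addStep({ stepType: "llm_response", tokens: 10 })
      await sleep(5)
      run.addStep({ stepType: "tool_call" })
    }),
    trackRunSketch("b", async (run) => {
      await sleep(1)
      run.addStep({ stepType: "llm_response", tokens: 20 })
    }),
  ])
}
```

A design that kept a single "current run" in a module-level variable would mix the steps of `a` and `b` here, which is the failure mode the per-run object avoids.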
Install
npm install agenttrace

Quick start
Set your credentials in .env:
AGENTTRACE_API_KEY="at_xxxxxxxxxxxxxxxxxxxx"
AGENTTRACE_URL="https://your-dashboard.com" # default: http://localhost:3001
AGENTTRACE_SERVICE="my-agent" # default: agent

import "dotenv/config"
import { trackRun, flush } from "agenttrace"

await trackRun("my-agent", async (run) => {
  run.addStep({ stepType: "llm_response", tokens: 120, latency: 350 })
})

// For scripts: wait for the queue to drain before process exits
await flush()

Usage
trackRun — context callback
The primary API. Receives the run object so you can record steps inside.
import { trackRun } from "agenttrace"
import { performance } from "perf_hooks"

const answer = await trackRun(
  "hotel-search",
  { model: "llama-3.3-70b", userId: "user_123", tags: ["prod"] },
  async (run) => {
    // Track an LLM call
    const t0 = performance.now()
    const resp = await groq.chat.completions.create({ model: "...", messages })
    run.addStep({
      stepType: "llm_response",
      input: messages.at(-1).content,
      output: resp.choices[0].message.content,
      tokens: resp.usage.total_tokens,
      latency: Math.round(performance.now() - t0),
    })

    // Track a tool call
    const t1 = performance.now()
    const results = await webSearch(query)
    run.addStep({
      stepType: "tool_call",
      input: query,
      output: results.slice(0, 2000),
      latency: Math.round(performance.now() - t1),
    })
    return results
  }
)

The options argument is optional:
await trackRun("my-agent", async (run) => {
  run.addStep({ stepType: "llm_response", tokens: 80 })
})

tracedRun — decorator
Wraps a function — run is injected as the first argument:
import { tracedRun } from "agenttrace"

const searchAgent = tracedRun(
  "search-agent",
  async (run, query: string) => {
    // search() stands in for your own agent logic; it is not part of the SDK
    const answer = await search(query)
    run.addStep({ stepType: "llm_response", tokens: 150, latency: 320 })
    return answer
  },
  { model: "llama-3.3-70b" } // RunOptions (optional)
)

// Call like a normal function — tracking is automatic
const result = await searchAgent("best hotels in Bangalore")

Marking a run failed
trackRun catches unhandled exceptions automatically and marks the run failed. For explicit failure paths:
await trackRun("my-agent", async (run) => {
  const data = await fetchData()
  if (!data) {
    run.fail("fetchData returned null")
    return
  }
  run.addStep({ stepType: "tool_response", output: JSON.stringify(data) })
})

Configuration
Environment variables
| Variable | Default | Description |
|---|---|---|
| AGENTTRACE_API_KEY | — | Required. Your API key (at_...) |
| AGENTTRACE_URL | http://localhost:3001 | Backend base URL |
| AGENTTRACE_SERVICE | agent | Default service name |
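
As a rough sketch of how the defaults in the table could be applied, the function below resolves a config object from an environment map. `resolveConfig` and `ResolvedConfig` are hypothetical names for this example, and throwing on a missing key is an assumption; the README does not say how the SDK itself reacts to a missing `AGENTTRACE_API_KEY`.

```typescript
// Hypothetical shape for illustration; the SDK's internal config type is not documented here.
interface ResolvedConfig {
  apiKey: string
  baseUrl: string
  serviceName: string
}

// Apply the documented defaults to whatever the environment provides.
// Assumption: a missing API key is treated as a hard error in this sketch.
function resolveConfig(env: Record<string, string | undefined>): ResolvedConfig {
  const apiKey = env.AGENTTRACE_API_KEY
  if (!apiKey) {
    throw new Error("AGENTTRACE_API_KEY is required")
  }
  return {
    apiKey,
    baseUrl: env.AGENTTRACE_URL ?? "http://localhost:3001",
    serviceName: env.AGENTTRACE_SERVICE ?? "agent",
  }
}
```

In a Node process this would be called as `resolveConfig(process.env)`.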
init() — programmatic config
Calling init() is optional when env vars are set. Use it to override defaults or tune the worker:
import { init } from "agenttrace"
init({
  apiKey: "at_xxxxxxxxxxxxxxxxxxxx",
  baseUrl: "https://your-dashboard.com",
  serviceName: "hotel-search-service",
  // Worker tuning (optional)
  batchSize: 30,          // flush after this many runs (default: 20)
  flushInterval: 3_000,   // flush every N ms (default: 2000)
  maxQueueSize: 2_000,    // drop with warning if queue exceeds this (default: 1000)
  maxRetries: 5,          // retry attempts per run (default: 4)
  retryBaseDelay: 1_000,  // base backoff ms, doubles each retry (default: 500)
})

init() can be called multiple times (e.g. in tests); each call replaces the worker and config.
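
To make the `maxRetries` and `retryBaseDelay` settings concrete, here is a minimal sketch of exponential backoff with jitter. `backoffDelay` and `sendWithRetry` are hypothetical helpers, and the jitter formula (up to 50% added on top of the doubled delay) is an assumption; the README only states that the delay doubles on each retry and that jitter is applied.

```typescript
// Delay before the retry that follows failed attempt `attempt` (0-based):
// baseMs doubled per attempt, plus up to 50% random jitter (jitter formula is an assumption).
function backoffDelay(
  attempt: number,
  baseMs = 500,
  random: () => number = Math.random,
): number {
  const exponential = baseMs * 2 ** attempt
  const jitter = exponential * 0.5 * random()
  return Math.round(exponential + jitter)
}

// Try `send` up to `maxAttempts` times, sleeping with backoff between failures.
// Resolves true on the first success, false once all attempts are used up.
async function sendWithRetry(
  send: () => Promise<void>,
  maxAttempts = 4,
  baseMs = 500,
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      await send()
      return true
    } catch {
      const isLast = attempt === maxAttempts - 1
      if (!isLast) {
        await new Promise<void>((resolve) => setTimeout(resolve, backoffDelay(attempt, baseMs)))
      }
    }
  }
  return false
}
```

With the default base of 500 ms and jitter ignored, the waits between four attempts would be 500, 1000, and 2000 ms. The jitter spreads retries from many processes so they do not hit a recovering backend at the same instant.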
run.addStep() reference
run.addStep({
  stepType: "llm_response", // required — see Step types below
  input: "",                // string — prompt / query / tool input
  output: "",               // string — completion / result / tool output
  tokens: 0,                // number — total tokens for this step
  latency: 0,               // number — wall-clock time in milliseconds
  cost: 0,                  // number — USD cost for this step
  status: "success",        // "success" | "failed"
})

Tokens and cost are summed automatically across all steps, so you don't need to track totals yourself.
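
The aggregation described above can be pictured as a simple reduction over the recorded steps. This is a sketch under assumptions: `StepRecord`, `summarize`, and the `totalTokens` / `totalCost` field names are hypothetical, since the README does not show the payload the SDK actually sends.

```typescript
// Hypothetical step shape; mirrors the addStep() fields relevant to totals.
interface StepRecord {
  stepType: string
  tokens?: number
  cost?: number
}

// Sum tokens and cost across all steps, treating missing values as zero.
function summarize(steps: StepRecord[]): { totalTokens: number; totalCost: number } {
  return steps.reduce(
    (acc, step) => ({
      totalTokens: acc.totalTokens + (step.tokens ?? 0),
      totalCost: acc.totalCost + (step.cost ?? 0),
    }),
    { totalTokens: 0, totalCost: 0 },
  )
}
```

Steps that omit `tokens` or `cost`, such as a typical `tool_call`, simply contribute nothing to the totals.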
Step types
| stepType | When to use |
|---|---|
| "llm_response" | Any LLM completion call |
| "llm_prompt" | Prompt-only span (before response arrives) |
| "tool_call" | External tool / function call |
| "tool_response" | Response from a tool |
| "user_prompt" | Initial user message |
| "decision" | Routing / branching logic |
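
If step types arrive as untyped strings (for example from config or another service), a small guard can check them against the table above before calling `run.addStep()`. `STEP_TYPES`, `StepType`, and `isStepType` are hypothetical names for this sketch; whether the SDK exports an equivalent type is not stated in this README.

```typescript
// The six values from the Step types table, kept as a readonly tuple.
const STEP_TYPES = [
  "llm_response",
  "llm_prompt",
  "tool_call",
  "tool_response",
  "user_prompt",
  "decision",
] as const

// Union of the literal values above.
type StepType = (typeof STEP_TYPES)[number]

// Narrow an arbitrary string to StepType at runtime.
function isStepType(value: string): value is StepType {
  return (STEP_TYPES as readonly string[]).includes(value)
}
```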
Flushing before exit
The background worker sends continuously in long-running processes (Express servers, queue workers). For short-lived scripts, await flush() before exit:
await runAgent(query)
await flush() // waits up to 10s for the queue to drain

A beforeExit hook provides a best-effort flush as a safety net, but an explicit flush() is more reliable for scripts.
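
One way to picture a bounded flush is a loop that polls the queue until it is empty or a deadline passes. This is only a sketch: `drainWithTimeout` is a hypothetical helper, and polling is an assumption about the mechanism, not a description of how the SDK's `flush()` is implemented.

```typescript
// Resolve true once isEmpty() reports an empty queue, or false when timeoutMs elapses first.
async function drainWithTimeout(
  isEmpty: () => boolean,
  timeoutMs = 10_000,
  pollMs = 50,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs
  while (!isEmpty()) {
    if (Date.now() >= deadline) {
      return false // give up; remaining items stay unsent
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs))
  }
  return true
}
```

Node emits `beforeExit` only when the event loop empties on its own, not after an explicit `process.exit()` or an uncaught exception, which is one reason an explicit flush is the safer choice in scripts.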
How it works
Agent code                      Background worker (setInterval)
──────────────────              ─────────────────────────────────
trackRun().__enter
  → AgentRun created            sleeping (flushInterval elapsed?)
run.addStep(...)
  → appended to run._steps
trackRun().__exit
  → run.toPayload()             wakes up: batchSize reached or
  → queue.push(payload)                   flushInterval elapsed
                                batch = queue.splice(0, batchSize)
                                POST /agent-metrics (with retry)
                                POST /agent-metrics
                                ...

The agent never waits on the network. If the queue fills up (backend down), new runs are dropped with a warning rather than blocking.
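
The flow above can be sketched as a small queue plus a timer. `SketchWorker`, `Payload`, and `Sender` are hypothetical names, and this version leaves out retries and the wake-on-batchSize path for brevity, so treat it as an illustration of the pattern rather than the SDK's actual worker.

```typescript
type Payload = { runName: string }
type Sender = (batch: Payload[]) => Promise<void>

class SketchWorker {
  queue: Payload[] = []
  dropped = 0
  timer: ReturnType<typeof setInterval> | undefined = undefined
  send: Sender
  batchSize: number
  maxQueueSize: number

  constructor(send: Sender, batchSize = 20, maxQueueSize = 1000) {
    this.send = send
    this.batchSize = batchSize
    this.maxQueueSize = maxQueueSize
  }

  // Called on the agent's path: synchronous, never touches the network.
  enqueue(payload: Payload): boolean {
    if (this.queue.length >= this.maxQueueSize) {
      this.dropped++
      console.warn(`queue full, dropping run "${payload.runName}"`)
      return false
    }
    this.queue.push(payload)
    return true
  }

  // Take up to batchSize payloads off the front of the queue and send them.
  async flushOnce(): Promise<number> {
    const batch = this.queue.splice(0, this.batchSize)
    if (batch.length > 0) {
      await this.send(batch)
    }
    return batch.length
  }

  // Flush on a fixed interval; a failed send is only logged in this sketch (no retry).
  start(flushIntervalMs = 2000): void {
    this.timer = setInterval(() => {
      this.flushOnce().catch((err) => console.warn("flush failed", err))
    }, flushIntervalMs)
  }

  stop(): void {
    if (this.timer !== undefined) {
      clearInterval(this.timer)
      this.timer = undefined
    }
  }
}
```

Because `enqueue` only pushes onto an array, the cost to the agent is constant regardless of backend health; the trade-off is that runs are lost once `maxQueueSize` is reached.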
Full example
import "dotenv/config"
import Groq from "groq-sdk"
import { tracedRun, flush } from "agenttrace"
import { performance } from "perf_hooks"

const groq = new Groq()

const hotelSearch = tracedRun(
  "hotel-search-agent",
  async (run, query: string) => {
    const messages = [
      { role: "system" as const, content: "You are a hotel search assistant." },
      { role: "user" as const, content: query },
    ]

    for (let i = 0; i < 10; i++) {
      const t0 = performance.now()
      const resp = await groq.chat.completions.create({
        model: "llama-3.3-70b-versatile",
        messages,
        tools: [/* ... */],
        tool_choice: "auto",
      })
      run.addStep({
        stepType: "llm_response",
        input: messages.at(-1)!.content,
        output: resp.choices[0].message.content ?? "",
        tokens: resp.usage?.total_tokens ?? 0,
        latency: Math.round(performance.now() - t0),
      })

      const msg = resp.choices[0].message
      if (!msg.tool_calls) return msg.content ?? ""

      // The assistant message that requested the tools must precede the tool results
      messages.push(msg)

      for (const tc of msg.tool_calls) {
        const args = JSON.parse(tc.function.arguments)
        const t1 = performance.now()
        const result = await myTool(args)
        run.addStep({
          stepType: "tool_call",
          input: JSON.stringify(args),
          output: String(result).slice(0, 2000),
          latency: Math.round(performance.now() - t1),
        })
        // tool_call_id links this result to the call that produced it
        messages.push({ role: "tool" as const, tool_call_id: tc.id, content: String(result) })
      }
    }

    run.fail("Max iterations reached")
    return ""
  },
  { model: "llama-3.3-70b-versatile", tags: ["hotel", "search"] }
)

const answer = await hotelSearch("best hotels in Bangalore")
console.log(answer)
await flush()

License
MIT
