@mimir-observe/sdk

v0.3.2, published 24 days ago

Auto-instrumentation and visibility for AI agents (TypeScript SDK)

Vulnerabilities: 0 high, 0 medium, 0 low

ta-khush

@mimir-observe/sdk

Auto-instrumentation and visibility for AI agents. Two lines of code, zero config.

npm install @mimir-observe/sdk

The TypeScript / Node counterpart to mimir-observe on PyPI. Same wire format, same dashboard.

Quick start

1. Install the SDK plus whichever LLM SDK your agent uses

npm install @mimir-observe/sdk
# pick whichever you use (or both):
npm install openai
npm install @anthropic-ai/sdk

openai and @anthropic-ai/sdk are optional peer dependencies — install only the ones you actually call. The unused adapters won't error.

2. Add instrumentation (1 line per provider)

Put this at the top of your entry point, before any client is constructed:

import { instrumentOpenAI } from '@mimir-observe/sdk/adapters/openai'
instrumentOpenAI()

import { instrumentAnthropic } from '@mimir-observe/sdk/adapters/anthropic'
instrumentAnthropic()

Your existing client.chat.completions.create(...) / client.messages.create(...) calls now get captured automatically.

Running under tsx, ts-node, jest, or a bundler? Pass your Anthropic class. These runtimes load modules through a custom loader, so when the adapter re-resolves @anthropic-ai/sdk on its own it can get a different class object than the one you imported. The patch then lands on a class nobody uses and nothing is traced, with no error raised. Hand the adapter the exact class instead:

import Anthropic from '@anthropic-ai/sdk'
import { instrumentAnthropic } from '@mimir-observe/sdk/adapters/anthropic'
instrumentAnthropic(Anthropic)

You can also pass a client instance or the imported module. Under plain Node, the no-argument instrumentAnthropic() works fine.
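To see why the exact class matters, here is a minimal sketch that uses neither SDK. makeMessagesClass is a hypothetical stand-in for a module that a custom loader evaluates twice, and patchCreate is an illustrative helper, not the adapter's real code. Patching the prototype of one copy leaves the other copy untouched:

```typescript
// Hypothetical stand-in for a module that a custom loader evaluates twice.
// Each call returns a brand-new class object built from identical source.
function makeMessagesClass() {
  return class Messages {
    create(prompt: string): string {
      return `reply to: ${prompt}`
    }
  }
}

const AdapterCopy = makeMessagesClass() // what the adapter resolved on its own
const AppCopy = makeMessagesClass() // what your code imported

const captured: string[] = []

// Illustrative patch: wrap prototype.create so each call is recorded.
function patchCreate(cls: { prototype: { create(prompt: string): string } }): void {
  const original = cls.prototype.create
  cls.prototype.create = function (this: unknown, prompt: string): string {
    captured.push(prompt)
    return original.call(this, prompt)
  }
}

patchCreate(AdapterCopy)
new AppCopy().create('hello') // not captured: the patch sits on the other class
const capturedAfterWrongPatch = captured.length

patchCreate(AppCopy)
new AppCopy().create('hello again') // captured: the patch sits on the class in use
```

Passing your own import to instrumentAnthropic(Anthropic) corresponds to the second patchCreate call: the patch is applied to the class your client is actually built from.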

ESM only. The SDK is published as ESM. Your project needs "type": "module" in package.json (or .mts files) to import it directly. CJS users: load it once via const { instrumentOpenAI } = await import('@mimir-observe/sdk/adapters/openai') inside an async startup function.

3. Configure the API key

# .env
MIMIR_API_KEY=mimir_xxxxxxxxxxxxxxxxxxxxxxxx
# Optional — only set if pointing at a self-hosted Mimir or staging:
# MIMIR_API_URL=https://api.mimir.sh

The SDK reads these from process.env lazily on the first API call. You can also call configure({ apiKey, apiUrl }) programmatically.
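Lazy resolution means the environment is read when the first call happens, not at import time, so a dotenv loader that runs after the import still takes effect. A rough sketch of that pattern follows; configureSketch, resolveConfig, the default URL, and the rule that explicit values win over process.env are all assumptions for illustration, not the SDK's confirmed behaviour:

```typescript
interface MimirConfig {
  apiKey: string | undefined
  apiUrl: string
}

// Assumed default; the README only shows this URL as a commented-out example.
const DEFAULT_API_URL = 'https://api.mimir.sh'

let overrides: Partial<MimirConfig> = {}
let cached: MimirConfig | undefined

// Hypothetical analogue of configure(): explicit values win over process.env.
function configureSketch(values: Partial<MimirConfig>): void {
  overrides = { ...overrides, ...values }
  cached = undefined // force re-resolution on the next call
}

// Reads process.env on first use and caches the result.
function resolveConfig(): MimirConfig {
  if (cached === undefined) {
    cached = {
      apiKey: overrides.apiKey ?? process.env.MIMIR_API_KEY,
      apiUrl: overrides.apiUrl ?? process.env.MIMIR_API_URL ?? DEFAULT_API_URL,
    }
  }
  return cached
}
```

Setting MIMIR_API_KEY after the module is imported but before the first resolveConfig() call is still picked up, which is the practical benefit of reading lazily.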

4. There is no step 4

Open the dashboard at https://app.mimir.sh. Every API call and run is captured.

Which adapter do I use?

| Your code | Adapter |
|---|---|
| import OpenAI from 'openai' | @mimir-observe/sdk/adapters/openai |
| import Anthropic from '@anthropic-ai/sdk' | @mimir-observe/sdk/adapters/anthropic |

If your project uses more than one provider, call the instrumentX() function for each of them.

Vercel AI SDK and LangChain adapters are planned — not yet shipped. Use manual instrumentation in the meantime.

Multi-turn agentic loops

If your agent calls the model multiple times in a loop, wrap it with trace() so all turns become steps in one named run instead of N separate runs:

import Anthropic from '@anthropic-ai/sdk'
import { trace } from '@mimir-observe/sdk'
import { instrumentAnthropic } from '@mimir-observe/sdk/adapters/anthropic'

instrumentAnthropic(Anthropic) // passing the class keeps this working under tsx, ts-node, jest, and bundlers
const client = new Anthropic()

await trace('Migration Planner', async (run) => {
  // Every messages.create() inside here becomes a step in one run.
  let response = await client.messages.create({ ... })
  while (response.stop_reason === 'tool_use') {
    response = await client.messages.create({ ... })
  }
  run.setOutput('done')
})

Each distinct agent should get its own trace('Agent Name') with a unique name. Single API calls outside a loop don't need this — they auto-create runs.

Streaming (Anthropic)

client.messages.stream(...) is auto-instrumented — it produces the same steps as the equivalent non-streaming client.messages.create(...). Listen or iterate as usual; Mimir captures telemetry from the stream's events without consuming it:

const stream = client.messages.stream({ model: 'claude-sonnet-4-20250514', messages, max_tokens: 1024 })
for await (const event of stream) {
  // your UI streaming, untouched
}
const message = await stream.finalMessage()

Works the same inside a trace() block (steps append to that run) or outside one (a standalone run opens at stream start and closes when the stream ends, errors, or aborts).
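"Without consuming it" can be pictured as an observer that adds its own listeners next to yours instead of reading the stream for you. The sketch below uses Node's EventEmitter as a stand-in for a streaming response; fakeStream and tapStream are hypothetical names, and the adapter's real mechanism is not documented here:

```typescript
import { EventEmitter } from 'node:events'

// Stand-in for a streaming response: emits 'text' chunks, then 'end'.
function fakeStream(chunks: string[]): { source: EventEmitter; start: () => void } {
  const source = new EventEmitter()
  const start = (): void => {
    for (const chunk of chunks) source.emit('text', chunk)
    source.emit('end')
  }
  return { source, start }
}

// Hypothetical observer: attaches its own listeners and returns the same
// object, so the caller's listeners still receive every event.
function tapStream(source: EventEmitter, onDone: (fullText: string) => void): EventEmitter {
  let text = ''
  source.on('text', (chunk: string) => {
    text += chunk
  })
  source.once('end', () => onDone(text))
  return source
}

const { source, start } = fakeStream(['Hel', 'lo'])
let telemetryText = ''
const uiChunks: string[] = []

tapStream(source, (fullText) => {
  telemetryText = fullText
})
source.on('text', (chunk: string) => uiChunks.push(chunk)) // the app's own listener
start()
```

Both the observer and the application listener see every chunk, which is the property the streaming adapter relies on.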

Use messages.stream(...), not messages.create({ stream: true }). The latter returns a low-level chunk iterator that Mimir passes through untouched — it is not instrumented. messages.stream(...) is the supported streaming entry point.

Upgrading? If you previously hand-rolled telemetry around a streaming call — manually calling run.llmCall(...), run.tool(...), run.reasoning(...), or run.setUsage(...) inside a trace() because streaming wasn't auto-instrumented — remove those calls. The adapter now records them automatically; keeping both produces duplicate steps.

Auto-instrumentation + manual tool results

Auto-instrumentation taps the raw Anthropic API, where the response carries the tool_use request but not its result — the result only arrives in your next request. So auto-recorded tool steps always have a null result.

If you execute tools yourself and have the real results, record them with run.tool(name, args, result) and call run.suppressAutoTools() once so the adapter stops emitting its own null-result tool steps. You keep automatic llm_call and reasoning capture — only tool steps become yours:

await trace('Research Agent', async (run) => {
  run.suppressAutoTools() // adapter skips its null-result tool steps

  let response = await client.messages.create({ ... })
  while (response.stop_reason === 'tool_use') {
    // A single response can request several tools; record each one.
    for (const block of response.content) {
      if (block.type !== 'tool_use') continue
      const result = await runTool(block.name, block.input) // your execution
      run.tool(block.name, block.input, result) // real result, recorded once
    }
    response = await client.messages.create({ ... })
  }
})

Without suppressAutoTools(), every tool appears twice: once from the adapter (null result) and once from your run.tool(...) call.
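The reason the adapter never sees a result is visible in the message shapes: the model's response carries a tool_use block, and the result goes back in your next request as a tool_result block that references the tool_use id. A minimal sketch with locally declared types (ToolUseBlock, ToolResultBlock, and toolResultTurn are illustrative names, not exports of either SDK):

```typescript
// Minimal local shapes mirroring the content blocks involved in tool calls.
interface ToolUseBlock {
  type: 'tool_use'
  id: string
  name: string
  input: unknown
}

interface ToolResultBlock {
  type: 'tool_result'
  tool_use_id: string
  content: string
}

interface UserTurn {
  role: 'user'
  content: ToolResultBlock[]
}

// Build the user turn that reports tool results back to the model.
// Results are looked up by tool_use id; a missing result becomes ''.
function toolResultTurn(calls: ToolUseBlock[], results: Map<string, string>): UserTurn {
  return {
    role: 'user',
    content: calls.map((call) => ({
      type: 'tool_result' as const,
      tool_use_id: call.id,
      content: results.get(call.id) ?? '',
    })),
  }
}

// Example: one grep call and the output your own code produced for it.
const exampleTurn = toolResultTurn(
  [{ type: 'tool_use', id: 'toolu_example', name: 'grep', input: { pattern: 'jwt.verify' } }],
  new Map([['toolu_example', 'found 3 matches']]),
)
```

Because the result only exists in code you run between the two requests, run.tool(name, args, result) is the place to record it.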

Manual instrumentation

For custom setups where auto-instrumentation doesn't fit:

import { task } from '@mimir-observe/sdk'

const agent = task('SecurityAuditor', JSON.stringify({ role: 'auditor' }), {
  model: 'claude-sonnet-4-20250514',
  tools: ['file_read', 'grep'],
})

await agent.run({ prompt: 'Audit the auth module' }, async (run) => {
  run.tool('grep', { pattern: 'jwt.verify' }, 'found 3 matches', 50)
  run.reasoning('Three call sites — checking each...')
  run.llmCall({ model: 'claude-sonnet-4-20250514', inputTokens: 1500, outputTokens: 800 })
  run.setOutput({ findings: 12, critical: 2 })
})

How it works

instrumentX() discovers the relevant class from the installed SDK at runtime and monkey-patches Completions.prototype.create (OpenAI) or Messages.prototype.create and Messages.prototype.stream (Anthropic). Every subsequent call is intercepted, telemetry is extracted from the request/response, and shipped to Mimir via fire-and-forget fetch. Your agent is never blocked.
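The patch-and-restore cycle described above can be sketched on a stand-in class. Completions here is a local class, not the OpenAI one, and instrumentSketch / uninstrumentSketch are hypothetical names; the point is only the shape: wrap prototype.create once, keep telemetry failures away from the caller, and put the original back on uninstrument.

```typescript
// Stand-in for an SDK resource class; not the real OpenAI or Anthropic class.
class Completions {
  create(body: { model: string }): { model: string; text: string } {
    return { model: body.model, text: 'ok' }
  }
}

type Create = Completions['create']
let originalCreate: Create | undefined

// Wrap prototype.create once. A throwing recorder never reaches the caller.
function instrumentSketch(record: (model: string) => void): void {
  if (originalCreate) return // already patched
  const original = Completions.prototype.create
  originalCreate = original
  Completions.prototype.create = function (this: Completions, body: { model: string }) {
    const response = original.call(this, body)
    try {
      record(response.model)
    } catch {
      // swallow: observability must not break the agent
    }
    return response
  }
}

// Restore the untouched method.
function uninstrumentSketch(): void {
  if (!originalCreate) return
  Completions.prototype.create = originalCreate
  originalCreate = undefined
}

const seenModels: string[] = []
instrumentSketch((model) => {
  seenModels.push(model)
})
const tracedReply = new Completions().create({ model: 'demo-model' }) // recorded
uninstrumentSketch()
const untracedReply = new Completions().create({ model: 'demo-model' }) // not recorded
```

Since the patch lives on the prototype, it also affects clients constructed before the patch was applied, as long as they share the same class object.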

Active-run context propagates across await boundaries via AsyncLocalStorage, which plays the role that contextvars (or threading.local in threaded code) plays in Python. This is what lets an adapter recognise "I'm inside a trace() block, append a step to that run" vs "no active run, start a new one".
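A small sketch of that decision using AsyncLocalStorage directly. traceSketch, recordStep, and the Run shape are hypothetical and much simpler than the real SDK; they only show how a step finds its enclosing run after crossing an await:

```typescript
import { AsyncLocalStorage } from 'node:async_hooks'

interface Run {
  name: string
  steps: string[]
}

const activeRun = new AsyncLocalStorage<Run>()
const finishedRuns: Run[] = []

// Hypothetical trace(): everything awaited inside fn sees the same Run.
async function traceSketch<T>(name: string, fn: (run: Run) => Promise<T>): Promise<T> {
  const run: Run = { name, steps: [] }
  try {
    return await activeRun.run(run, () => fn(run))
  } finally {
    finishedRuns.push(run)
  }
}

// What a patched create() might do: append to the active run, or open a
// standalone single-step run when there is none.
async function recordStep(step: string): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 1)) // cross an await boundary
  const run = activeRun.getStore()
  if (run) run.steps.push(step)
  else finishedRuns.push({ name: step, steps: [step] })
}

async function demo(): Promise<Run[]> {
  await traceSketch('Migration Planner', async () => {
    await recordStep('turn 1')
    await recordStep('turn 2')
  })
  await recordStep('lone call') // no active run: becomes its own run
  return finishedRuns
}
```

Calling demo() yields two runs: 'Migration Planner' with both turns as steps, and a standalone run for the call made outside the trace.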

- No required runtime deps: uses only node:async_hooks, node:crypto, and global fetch
- Dashboard down? Agent runs normally, no errors thrown
- Uninstrument any time: uninstrumentOpenAI(), uninstrumentAnthropic()
- ESM-only, Node ≥20.19
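The "dashboard down, agent keeps running" guarantee comes down to never awaiting the telemetry request and never letting its failure escape. A minimal sketch of that pattern, assuming Node's global fetch; shipSketch, FetchLike, and the example URL are hypothetical, and the injectable fetchImpl parameter exists only to make the sketch easy to exercise:

```typescript
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<unknown>

// Send telemetry without awaiting it. Synchronous throws and rejected
// promises are both swallowed, so the caller never sees a failure.
function shipSketch(url: string, payload: unknown, fetchImpl: FetchLike = fetch): void {
  try {
    void fetchImpl(url, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify(payload),
    }).catch(() => {
      // dashboard unreachable: drop the event
    })
  } catch {
    // e.g. the payload could not be serialized: drop the event
  }
}

// Simulate an outage: the request rejects, the call still returns normally.
const failingFetch: FetchLike = () => Promise.reject(new Error('dashboard down'))
shipSketch('https://example.invalid/events', { step: 'llm_call' }, failingFetch)
```

The trade-off of fire-and-forget is that dropped events are lost silently, which is the behaviour the list above describes.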

Onboarding with Claude Code

If you use Claude Code, paste this prompt to have it instrument your project automatically:

Install and set up Mimir agent observability in this Node project.

Step 1: Install dependencies.
  npm install @mimir-observe/sdk
  Plus whichever LLM SDK the project actually uses (check the imports):
    - npm install openai
    - npm install @anthropic-ai/sdk

Step 2: Find the entry point(s) and determine which SDK is used:
  - `import OpenAI from 'openai'`       → instrumentOpenAI
  - `import Anthropic from '@anthropic-ai/sdk'` → instrumentAnthropic

  Add at the top of each entry point, BEFORE any client is constructed:

    import { instrumentOpenAI } from '@mimir-observe/sdk/adapters/openai'
    instrumentOpenAI()

  Or for Anthropic:

    import { instrumentAnthropic } from '@mimir-observe/sdk/adapters/anthropic'
    instrumentAnthropic()

  IMPORTANT — if the project runs under tsx, ts-node, jest, or a bundler,
  pass the Anthropic class explicitly, or NOTHING is traced (silent, no error):

    import Anthropic from '@anthropic-ai/sdk'
    import { instrumentAnthropic } from '@mimir-observe/sdk/adapters/anthropic'
    instrumentAnthropic(Anthropic)

  Notes:
  - The SDK is ESM. The project needs "type": "module" in package.json
    to import it directly. For CommonJS projects, load it once via
    dynamic import in an async startup function:
      const { instrumentOpenAI } = await import('@mimir-observe/sdk/adapters/openai')
      instrumentOpenAI()

Step 3: If the code has multi-turn agentic loops (calling the API multiple
times in a while/for loop), wrap each agent's loop with trace() so all
turns become steps in one run instead of separate runs:

    import { trace } from '@mimir-observe/sdk'

    await trace('Migration Planner', async (run) => {
      // ... the existing loop goes here, unchanged ...
    })

  Each distinct agent should get its own trace() with a unique name.
  Single API calls outside a loop do NOT need this wrapper.

Step 4: Configure the API key. Get one at https://app.mimir.sh and add to .env:
    MIMIR_API_KEY=mimir_xxxxxxxx
    # Optional override for self-hosted/staging:
    # MIMIR_API_URL=https://api.mimir.sh
  The SDK reads from process.env at the first API call. If MIMIR_API_KEY
  is missing the SDK prints a one-time stderr warning and traces are
  dropped server-side (the agent keeps running — POSTs are fire-and-forget).

Step 5: Run the agent. Traces appear at https://app.mimir.sh.

Development

# typecheck
npx turbo typecheck --filter=@mimir-observe/sdk

# integration tests — hits the real apps/api dev server + real Supabase
# (no mocking; see docs/decisions/0006-vitest-integration-tests.md)
cp .env.test.example .env.test  # fill in SUPABASE_SERVICE_ROLE_KEY
supabase db reset                # seeds the test API key
npm test --workspace=@mimir-observe/sdk

# build the publishable artifact
npm run build --workspace=@mimir-observe/sdk

See PUBLISHING.md for the release flow.

License

MIT
