argus-bedrock-tracer v0.1.0
@ai-eval/bedrock-tracer

TypeScript SDK for integrating your Node.js / Amazon Bedrock agent with the Argus evaluation and observability dashboard.

The SDK provides two integration modes:

| Mode | When to use |
|---|---|
| instrument() | Live tracing — wrap an existing BedrockRuntimeClient to stream real-time traces into Argus with zero agent logic changes |
| EvalTracer + EvalServer | Batch evaluation — expose an eval endpoint that the Python runner calls to score offline prompt sets |


Live Tracing — instrument()

Installation

npm install
npm run build   # compiles TypeScript → dist/

Usage

import { BedrockRuntimeClient } from "@aws-sdk/client-bedrock-runtime";
import { instrument } from "./dist/instrument";

const client = new BedrockRuntimeClient({ region: "us-east-1" });

const tracer = instrument(client, {
  server:  "http://localhost:7070",  // Argus UI base URL
  apiKey:  "aek_...",               // project API key from Argus Settings
  runName: "My Agent — Production", // optional label shown in the UI
  tags:    ["prod", "v2"],
  agentId: "my-agent",
  verbose: true,                    // log each push to console (default: true)
});

// Use `client` exactly as before — instrument() intercepts transparently
const response = await client.send(converseCommand);

// Stop intercepting when done
tracer.stop();
console.log("run ID:", tracer.runId);

How it works

  1. instrument() patches client.send() to intercept ConverseCommand and ConverseStreamCommand calls.
  2. On the first call of each agent invocation it captures:
    • System prompt from command.input.system
    • Context — the injected content block(s) prepended before the user question (e.g. page URL, user state JSON)
    • User prompt — the actual last user question (last content block in the current turn)
  3. After each LLM step it pushes a trace snapshot to POST /api/ingest — the same run is updated in-place so you see live progress in the dashboard.
  4. When the model returns end_turn the run is marked complete and state is reset for the next invocation.
  5. A conversation ID is auto-derived from a djb2 hash of the system prompt + first user message — the same conversation always gets the same ID across turns with no extra code.
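The conversation-ID derivation in step 5 can be sketched with the classic djb2 string hash (a sketch under stated assumptions — the SDK's exact seed, input concatenation, and ID formatting may differ):

```typescript
// djb2 string hash — illustrates how a stable conversation ID can be
// derived from the system prompt + first user message. Hypothetical
// helper; not the SDK's internal implementation.
function djb2(str: string): number {
  let hash = 5381;
  for (let i = 0; i < str.length; i++) {
    // hash * 33 + charCode, kept in the 32-bit unsigned range
    hash = ((hash << 5) + hash + str.charCodeAt(i)) >>> 0;
  }
  return hash;
}

function conversationId(systemPrompt: string, firstUserMessage: string): string {
  return `conv-${djb2(systemPrompt + firstUserMessage).toString(16)}`;
}

// Deterministic: the same system prompt + first message always yields
// the same ID, so multi-turn conversations group together automatically.
```

Because the hash depends only on the inputs, no state needs to be persisted between turns to keep the grouping stable.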

Context vs. User Prompt separation

If your agent prepends an injected context block as a separate content block before the user's question, the SDK detects and separates them automatically:

command.input.messages = [
  {
    role: "user",
    content: [
      { text: "Current page url is https://... user state: {...}" },  // ← context
      { text: "Give me a login overview" }                           // ← user prompt
    ]
  }
]
  • context is sent separately in the ingest payload and shown in a collapsible Context card in the Argus trace detail panel.
  • prompt is the actual user question — shown in the traces table and the User Prompt card.
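The splitting rule above can be sketched as follows, assuming the heuristic is simply "all text blocks before the last one are context" (the SDK may apply additional checks):

```typescript
interface ContentBlock {
  text?: string;
}

// Split a user message's content blocks into injected context and the
// actual user prompt. Illustrative sketch — assumes the last text block
// is always the user's question.
function splitContext(content: ContentBlock[]): { context: string | null; prompt: string } {
  const texts = content
    .map((b) => b.text)
    .filter((t): t is string => typeof t === "string");
  if (texts.length <= 1) {
    // Single block: no injected context to separate
    return { context: null, prompt: texts[0] ?? "" };
  }
  return {
    context: texts.slice(0, -1).join("\n"),
    prompt: texts[texts.length - 1],
  };
}
```

With the example message above, `splitContext` would return the page-URL/user-state block as `context` and "Give me a login overview" as `prompt`.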

InstrumentOptions

| Option | Type | Default | Description |
|---|---|---|---|
| server | string | required | Argus UI base URL, e.g. "http://localhost:7070" |
| apiKey | string | — | Project API key — automatically assigns traces to the matching project |
| runName | string | "Live — <date>" | Label for the run shown in the UI |
| tags | string[] | — | Tags attached to the run |
| agentId | string | — | Agent identifier stored in run metadata |
| verbose | boolean | true | Log each push to console |

Instrumenter interface

interface Instrumenter {
  readonly runId: string | null;  // server-assigned run ID (null until the first push succeeds)
  stop(): void;                   // restores the original client.send()
}

Batch Evaluation — EvalTracer + EvalServer

Use this mode when the Python eval runner drives evaluation (it calls your agent server with each prompt from a prompt set).

Python eval runner
    │
    │  POST /eval  { prompt, system_prompt, run_id, prompt_id }
    ▼
EvalServer  (this SDK)
    │
    │  calls your handler
    ▼
EvalTracer.run(prompt, toolHandler, tools, systemPrompt)
    │
    │  Bedrock ConverseCommand loop
    ▼
Bedrock (Claude / Nova / Llama …)
    │
    │  tool_use → toolHandler → tool results
    ▼
EvalTracer returns { content, trace }
    │
    │  POST response  { content, trace: EvalTrace }
    ▼
Python eval runner  ← evaluators run against content + trace

Quick start

import {
  EvalTracer,
  EvalServer,
  ToolHandler,
} from "@ai-eval/bedrock-tracer";
import { Tool } from "@aws-sdk/client-bedrock-runtime";

const tools: Tool[] = [
  {
    toolSpec: {
      name: "calculator",
      description: "Evaluate a simple arithmetic expression",
      inputSchema: {
        json: {
          type: "object",
          properties: {
            expression: { type: "string", description: "e.g. '2 + 2'" },
          },
          required: ["expression"],
        },
      },
    },
  },
];

const toolHandler: ToolHandler = async (name, input) => {
  if (name === "calculator") {
    // Demo only — evaluating arbitrary expressions with Function() is
    // unsafe for untrusted input; use a real expression parser in production
    const result = Function(`"use strict"; return (${input.expression})`)();
    return String(result);
  }
  throw new Error(`Unknown tool: ${name}`);
};

const tracer = new EvalTracer({
  modelId: "anthropic.claude-3-5-sonnet-20241022-v2:0",
  region: "us-east-1",
  maxIterations: 10,
});

const server = new EvalServer({
  port: 3000,
  handler: async (req) => {
    const { content, trace } = await tracer.run(
      req.prompt,
      toolHandler,
      tools,
      req.system_prompt ?? undefined
    );
    return { content, trace };
  },
});

server.start();
// Listening on http://localhost:3000
// POST /eval   — eval endpoint
// GET  /health — health check

EvalTracer options

| Option | Type | Default | Description |
|---|---|---|---|
| modelId | string | required | Bedrock model ID |
| region | string | AWS_REGION env / "us-east-1" | AWS region |
| maxIterations | number | 20 | Max agentic loop iterations |
| client | BedrockRuntimeClient | auto-created | Bring your own client |

tracer.run(
  prompt: string,
  toolHandler?: ToolHandler,
  tools?: Tool[],
  systemPrompt?: string,
): Promise<{ content: string; trace: EvalTrace }>

EvalServer options

| Option | Type | Default | Description |
|---|---|---|---|
| port | number | 3000 | Listen port |
| evalPath | string | "/eval" | POST endpoint path |
| healthPath | string | "/health" | GET health check path |
| handler | AgentHandler | required | Your agent function |
| logger | Console-like | console | Custom logger |

Connecting to the Python eval runner

from eval_framework.connectors.node_connector import NodeAgentConnector
from eval_framework.runner.runner import EvaluationRunner

connector = NodeAgentConnector(
    base_url="http://localhost:3000",
    timeout_seconds=120,
)

runner = EvaluationRunner(connector=connector, ...)

Shared types

interface EvalTrace {
  steps: StepTrace[];            // one per Converse API call
  toolCallChain: string[];       // ordered list of every tool invoked
  toolsUsed: string[];           // unique tools (ordered by first use)
  totalInputTokens: number;
  totalOutputTokens: number;
  totalToolCalls: number;
  totalLatencyMs: number;
  finalResponse: string;
  terminatedReason: "end_turn" | "max_iterations" | "error" | "unknown";
}
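The relationship between `toolCallChain` and `toolsUsed` can be sketched as de-duplication that preserves first-use order (illustrative only, not the SDK's internal code; `Set` iterates in insertion order, which gives exactly this behavior):

```typescript
// Derive `toolsUsed` (unique, ordered by first use) from `toolCallChain`
// (every invocation in order). Hypothetical helper for illustration.
function uniqueByFirstUse(toolCallChain: string[]): string[] {
  return [...new Set(toolCallChain)];
}

// e.g. a chain of ["search", "calculator", "search"] yields
// ["search", "calculator"] — each tool listed once, at its first position
```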

Building

npm run build   # compiles to dist/
npm run dev     # watch mode

AWS credentials

Credentials are resolved by @aws-sdk/client-bedrock-runtime in standard priority order:

  1. AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_SESSION_TOKEN env vars
  2. ~/.aws/credentials profile (AWS_PROFILE env var)
  3. IAM instance / task role
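For local development, options 1 and 2 might look like the following (placeholder values — substitute your own credentials and profile name):

```shell
# Option 1 (highest priority): explicit environment variables
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
export AWS_SESSION_TOKEN="<your-session-token>"   # only needed for temporary credentials

# Option 2: select a named profile from ~/.aws/credentials
export AWS_PROFILE="<your-profile-name>"
```

On EC2 or ECS (option 3), no configuration is needed — the SDK picks up the instance or task role automatically.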