llm-metrics


Metrics collection system for LLMs and AI agents

Track performance, latency, and usage metrics for agents, tools, and LLM requests. Perfect for monitoring LLM applications, AI agents, and agentic systems.

Installation · Quick Start · Documentation · Contributing


A professional, framework-agnostic metrics collection system designed specifically for LLM applications and AI agents. Built with TypeScript, featuring type-safe APIs, comprehensive validation, and flexible persistence backends.

✨ Features

  • 🚀 Framework-agnostic - Works with any JavaScript/TypeScript project (Next.js, Express, Hono, etc.)
  • 📊 Multiple metric types - Track agents, tools, latency, and request timing
  • 💾 Flexible persistence - In-memory by default, pluggable persistence backends (PostgreSQL, MongoDB, Redis)
  • 🛡️ Type-safe - Full TypeScript support with strict types and IntelliSense
  • 🔍 Validation - Built-in metric validation (configurable, prevents invalid data)
  • 📈 Aggregations - Built-in summary statistics, percentiles, and histograms
  • 🎨 Formatting - Human-readable metric formatting utilities for logging
  • 🔌 Extensible - Custom persistence backends, loggers, and event hooks
  • ⚡ Zero dependencies - No runtime dependencies, lightweight and fast
  • 🔎 Query API - Flexible filtering by context, time range, duration, metadata, etc.
  • 📦 Batch operations - Efficient batch recording for migrations and imports
  • 📊 Derived metrics - Rate calculations, error rates, and trend analysis
  • 🧪 Well tested - Comprehensive test suite (154 tests, 290+ assertions)
  • 📦 ESM-only - Modern JavaScript, no CommonJS legacy code

📦 Installation

Install llm-metrics from npm:

npm install llm-metrics

Or using your preferred package manager:

# Bun
bun add llm-metrics

# Yarn
yarn add llm-metrics

# pnpm
pnpm add llm-metrics

Requirements

  • Node.js >= 22.0.0 (LTS) or Bun >= 1.3.0
  • ESM-only - This package uses ES Modules only (no CommonJS support)
  • TypeScript 5.6+ (recommended for type safety)
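
Because the package is ESM-only, the consuming project must itself run as ES Modules. A minimal consumer package.json might look like this (a sketch; the version range assumes the current v0.7.0 release):

{
  "name": "my-llm-app",
  "type": "module",
  "dependencies": {
    "llm-metrics": "^0.7.0"
  }
}

Alternatively, use .mjs/.mts file extensions instead of setting "type": "module".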

🚀 Quick Start

Get started with llm-metrics in under 2 minutes:

import { metricsCollector, measureAgent, measureTool } from 'llm-metrics';

// Measure an agent execution (e.g., LLM agent, AI assistant)
const result = await measureAgent(
  'memory-manager',        // Agent identifier
  'conversation-123',      // Context ID (conversation, session, etc.)
  async () => {
    // Your agent code here
    const facts = await extractFacts();
    return { facts, count: facts.length };
  }
);

// Measure a tool execution (e.g., database query, API call)
const toolResult = await measureTool(
  'search-database',       // Tool name
  'conversation-123',      // Context ID
  async () => {
    // Your tool code here
    return await db.query('SELECT * FROM users');
  }
);

// Get summary statistics
const summary = metricsCollector.getSummary(3600000); // Last hour
console.log(`Agents executed: ${summary.totalAgentsExecuted}`);
console.log(`Average duration: ${summary.averageAgentDuration}ms`);
console.log(`Tools called: ${summary.totalToolsCalled}`);

📚 Core Concepts

Metric Types

llm-metrics supports four types of metrics optimized for LLM and AI agent workflows:

  1. 🤖 Agent Metrics - Track execution of AI agents, LLM calls, or long-running processes

    • Duration, success/failure, custom metadata
    • Perfect for monitoring agent performance and reliability
  2. 🔧 Tool Metrics - Track individual tool/function calls (function calling, RAG queries, etc.)

    • Success rate, execution time, error tracking
    • Essential for debugging tool usage in agentic systems
  3. ⏱️ Latency Metrics - Track specific operations or bottlenecks

    • Embedding generation, vector search, cache lookups
    • Identify performance bottlenecks in your LLM pipeline
  4. 📡 Request Timing Metrics - Track client vs server timing for requests

    • Client-side latency, server processing time, streaming duration
    • Understand end-to-end user experience

Storage Architecture

  • In-memory - Fast access, limited by maxMetrics (default: 1000)

    • Perfect for real-time monitoring and debugging
    • Automatically rotates oldest metrics when limit reached
  • Persistence - Optional backend for long-term storage

    • PostgreSQL, MongoDB, Redis, or any custom backend
    • Implement MetricsPersistence interface for your database
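
To bound memory, the collector keeps at most maxMetrics entries and evicts the oldest first. A minimal sketch of that behavior (assuming the snapshot exposes a latency array, matching the agents and tools arrays used later in this README):

import { MetricsCollector } from 'llm-metrics';

// Cap in-memory storage at 100 metrics; oldest entries are evicted FIFO.
const collector = new MetricsCollector(undefined, undefined, { maxMetrics: 100 });

for (let i = 0; i < 150; i++) {
  collector.recordLatency({
    operation: `op-${i}`,
    startTime: Date.now() - 10,
    endTime: Date.now(),
    duration: 10,
  });
}

// Only the 100 most recent latency metrics remain in memory.
console.log(collector.getSnapshot().latency.length); // 100 (assumed snapshot field)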

💡 Use Cases

Perfect for:

  • LLM Applications - Monitor GPT-4, Claude, Gemini API calls
  • AI Agents - Track agent execution, tool usage, and performance
  • RAG Systems - Measure vector search, embedding generation latency
  • Agentic Workflows - Monitor multi-step agent operations
  • Production Monitoring - Track metrics in production LLM applications

📖 Usage Examples

Basic Agent Tracking

import { measureAgent } from 'llm-metrics';

const result = await measureAgent(
  'data-processor',
  'session-123',
  async () => {
    // Process data
    const processed = await processData();
    return processed;
  }
);

Agent Tracking with Custom Metadata

import { measureAgentWithMetrics } from 'llm-metrics';

const result = await measureAgentWithMetrics(
  'memory-manager',
  'conversation-456',
  async () => {
    const facts = await extractFacts();
    const summary = facts.join('; '); // build the summary the metadata extractor reads below
    return { facts, count: facts.length, summary };
  },
  (result) => ({
    factsExtracted: result.count,
    summaryLength: result.summary?.length ?? 0,
  })
);

Tool Tracking

import { measureTool } from 'llm-metrics';

const result = await measureTool(
  'database-query',
  'request-789',
  async () => {
    return await db.query('SELECT * FROM users');
  }
);

Manual Metric Recording

import { metricsCollector } from 'llm-metrics';

// Record agent metrics manually
metricsCollector.recordAgent({
  agentId: 'custom-agent',
  contextId: 'context-123',
  startTime: Date.now() - 5000,
  endTime: Date.now(),
  duration: 5000,
  metadata: {
    customField: 'value',
    itemsProcessed: 42,
  },
});

// Record latency metrics
metricsCollector.recordLatency({
  operation: 'cache-lookup',
  startTime: Date.now() - 100,
  endTime: Date.now(),
  duration: 100,
  metadata: {
    cacheHit: true,
  },
});

Request Timing (Client vs Server)

import { metricsCollector } from 'llm-metrics';

metricsCollector.recordRequestTiming({
  contextId: 'request-123',
  serverTimeToFirstChunk: 500,
  serverStreamDuration: 2000,
  serverTotalDuration: 2500,
  clientTimeToFirstChunk: 800, // From Performance API
  clientRequestStart: performance.now(),
  networkLatencyEstimate: 300, // client - server difference
  metadata: {
    model: 'gpt-4',
    messageCount: 5,
  },
});

⚙️ Configuration

Customize llm-metrics to fit your needs:

Custom Persistence Backend

import { MetricsPersistence, metricsCollector } from 'llm-metrics';
import type { AgentMetrics, ToolMetrics, LatencyMetrics, RequestTimingMetrics } from 'llm-metrics';

class MyDatabasePersistence implements MetricsPersistence {
  async persistAgentMetrics(metrics: AgentMetrics): Promise<void> {
    // Save to your database
    await db.insert('agent_metrics', metrics);
  }

  async persistToolMetrics(metrics: ToolMetrics): Promise<void> {
    await db.insert('tool_metrics', metrics);
  }

  async persistLatencyMetrics(metrics: LatencyMetrics): Promise<void> {
    await db.insert('latency_metrics', metrics);
  }

  async persistRequestTimingMetrics(metrics: RequestTimingMetrics): Promise<void> {
    await db.insert('request_timing_metrics', metrics);
  }

  async getAgentMetrics(timeRangeMs?: number, contextId?: string): Promise<AgentMetrics[]> {
    // Retrieve from database
    return await db.query('SELECT * FROM agent_metrics WHERE ...');
  }

  // ... implement other get methods
}

// Configure persistence
metricsCollector.setPersistence(new MyDatabasePersistence());

Custom Logger

import { MetricsLogger, metricsCollector } from 'llm-metrics';

class MyLogger implements MetricsLogger {
  info(message: string, data?: Record<string, unknown>): void {
    console.log(`[INFO] ${message}`, data);
  }

  debug(message: string, data?: Record<string, unknown>): void {
    console.debug(`[DEBUG] ${message}`, data);
  }

  warn(message: string, data?: Record<string, unknown>): void {
    console.warn(`[WARN] ${message}`, data);
  }

  error(message: string, data?: Record<string, unknown>): void {
    console.error(`[ERROR] ${message}`, data);
  }
}

metricsCollector.setLogger(new MyLogger());

Collector Configuration

import { MetricsCollector, MetricsCollectorConfig } from 'llm-metrics';

const config: MetricsCollectorConfig = {
  maxMetrics: 5000, // Keep more metrics in memory
  validateMetrics: true, // Enable validation (default)
  throwOnValidationError: false, // Don't throw, just log (default)
};

const customCollector = new MetricsCollector(undefined, undefined, config);

// Or configure existing collector
metricsCollector.configure({
  maxMetrics: 2000,
});

API Reference

MetricsCollector

Methods

  • recordAgent(metrics: AgentMetrics): void - Record agent metrics
  • recordTool(metrics: ToolMetrics): void - Record tool metrics
  • recordLatency(metrics: LatencyMetrics): void - Record latency metrics
  • recordRequestTiming(metrics: RequestTimingMetrics): void - Record request timing
  • getSnapshot(): MetricsSnapshot - Get all current metrics
  • getSummary(timeRangeMs?: number): MetricsSummary - Get aggregated statistics
  • getContextMetrics(contextId: string): Promise<...> - Get metrics for a context
  • clear(): void - Clear all metrics
  • setPersistence(persistence: MetricsPersistence): void - Configure persistence
  • setLogger(logger: MetricsLogger): void - Configure logger
  • configure(config: Partial<MetricsCollectorConfig>): void - Update configuration

Helper Functions

  • measureAgent<T>(agentId, contextId?, execute): Promise<T> - Measure agent execution
  • measureAgentWithMetrics<T>(agentId, contextId, execute, extractMetadata): Promise<T> - Measure with metadata extraction
  • measureTool<T>(toolName, contextId, execute): Promise<T> - Measure tool execution
  • measureToolWithMetadata<T>(toolName, contextId, execute, extractMetadata): Promise<T> - Measure tool with metadata (sketch below)
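
measureToolWithMetadata has no dedicated example elsewhere in this README, so here is a hedged sketch assuming it mirrors measureAgentWithMetrics, with the final argument extracting custom metadata from the tool's return value:

import { measureToolWithMetadata } from 'llm-metrics';

const rows = await measureToolWithMetadata(
  'database-query',
  'request-789',
  async () => db.query('SELECT * FROM users'), // db is a placeholder, as in the examples above
  (result) => ({ rowCount: result.length })    // assumes the query resolves to an array
);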

Formatting Utilities

  • formatDuration(ms: number): string - Format duration (e.g., "1.5s", "2m 5s")
  • formatDurationDetailed(ms: number): string - Detailed duration format
  • formatAgentMetrics(metrics: AgentMetrics): string - Human-readable agent metrics
  • formatToolMetrics(metrics: ToolMetrics): string - Human-readable tool metrics
  • formatLatencyMetrics(metrics: LatencyMetrics): string - Human-readable latency metrics
  • formatMetricsSummary(summary: MetricsSummary): string - Human-readable summary
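
A quick sketch of these helpers in use (the formatted output strings are illustrative):

import { metricsCollector, formatDuration, formatMetricsSummary } from 'llm-metrics';

console.log(formatDuration(1500));   // "1.5s"
console.log(formatDuration(125000)); // "2m 5s"

const summary = metricsCollector.getSummary(3600000); // last hour
console.log(formatMetricsSummary(summary));           // multi-line, human-readable report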

Validation

  • validateAgentMetrics(metrics: AgentMetrics): ValidationResult - Validate agent metrics
  • validateToolMetrics(metrics: ToolMetrics): ValidationResult - Validate tool metrics
  • validateLatencyMetrics(metrics: LatencyMetrics): ValidationResult - Validate latency metrics
  • validateRequestTimingMetrics(metrics: RequestTimingMetrics): ValidationResult - Validate request timing
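
This README doesn't spell out the ValidationResult shape; the sketch below assumes a valid flag plus an errors array:

import { validateAgentMetrics } from 'llm-metrics';

const check = validateAgentMetrics({
  agentId: 'custom-agent',
  startTime: Date.now() - 1000,
  endTime: Date.now(),
  duration: 1000,
});

// Assumed shape: { valid: boolean; errors: string[] }
if (!check.valid) {
  console.warn('Invalid agent metrics:', check.errors);
}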

Types

AgentMetrics

interface AgentMetrics {
  agentId: string;
  contextId?: string; // Generic context ID (conversationId, sessionId, requestId, etc.)
  startTime: number; // Timestamp in milliseconds
  endTime?: number; // Timestamp in milliseconds
  duration?: number; // Duration in milliseconds
  metadata?: Record<string, unknown>; // Custom metadata
  error?: string; // Error message if failed
}

ToolMetrics

interface ToolMetrics {
  toolName: string;
  contextId?: string;
  startTime: number;
  endTime?: number;
  duration?: number;
  success: boolean;
  error?: string;
  metadata?: Record<string, unknown>;
}

LatencyMetrics

interface LatencyMetrics {
  operation: string;
  startTime: number;
  endTime: number;
  duration: number;
  metadata?: Record<string, unknown>;
}

RequestTimingMetrics

interface RequestTimingMetrics {
  contextId?: string;
  serverTimeToFirstChunk: number; // milliseconds
  serverStreamDuration: number; // milliseconds
  serverTotalDuration: number; // milliseconds
  clientTimeToFirstChunk?: number; // milliseconds (from Performance API)
  clientRequestStart?: number; // performance.now() timestamp
  networkLatencyEstimate?: number; // milliseconds
  metadata?: Record<string, unknown>;
}

Examples

See the examples/ directory for complete, runnable examples.

Advanced Usage

Event Hooks

Use event hooks to integrate with external systems, dashboards, or alerting:

import { metricsCollector } from 'llm-metrics';

// Set up callbacks
metricsCollector.setCallbacks({
  onAgentRecorded: (metrics) => {
    // Send to monitoring service, update dashboard, etc.
    console.log('Agent executed:', metrics.agentId, metrics.duration);
  },
  onToolRecorded: (metrics) => {
    // Track tool usage, alert on failures, etc.
    if (!metrics.success) {
      console.error('Tool failed:', metrics.toolName);
    }
  },
});

// Or configure during construction
const collector = new MetricsCollector(persistence, logger, {
  callbacks: {
    onAgentRecorded: (metrics) => { /* ... */ },
    onToolRecorded: (metrics) => { /* ... */ },
  },
});

See examples/event-hooks.ts for complete examples.

Query and Filter API

Query metrics with flexible filter criteria:

import { metricsCollector } from 'llm-metrics';

// Filter by multiple context IDs
const metrics = metricsCollector.queryMetrics({
  contextIds: ['session-123', 'session-456'],
});

// Filter by agent IDs
const agentMetrics = metricsCollector.queryMetrics({
  agentIds: ['data-processor'],
});

// Filter by time range
const recentMetrics = metricsCollector.queryMetrics({
  startTime: Date.now() - 3600000, // Last hour
  endTime: Date.now(),
});

// Filter by duration range
const slowMetrics = metricsCollector.queryMetrics({
  minDuration: 5000, // Slower than 5 seconds
});

// Filter by metadata
const dataMetrics = metricsCollector.queryMetrics({
  metadata: { category: 'data' },
});

// Combine multiple filters
const complexFilter = metricsCollector.queryMetrics({
  contextIds: ['session-123'],
  minDuration: 1000,
  maxDuration: 5000,
  metadata: { category: 'data' },
});

See examples/query-filter.ts for complete examples.

Batch Operations

Record multiple metrics efficiently in batch:

import { metricsCollector } from 'llm-metrics';

// Record multiple agents in batch
metricsCollector.recordAgents([
  { agentId: 'agent-1', startTime: Date.now(), /* ... */ },
  { agentId: 'agent-2', startTime: Date.now(), /* ... */ },
]);

// Record multiple tools in batch
metricsCollector.recordTools([
  { toolName: 'tool-1', startTime: Date.now(), success: true, /* ... */ },
  { toolName: 'tool-2', startTime: Date.now(), success: false, /* ... */ },
]);

// Record multiple latency metrics in batch
metricsCollector.recordLatencies([
  { operation: 'op-1', startTime: Date.now() - 100, endTime: Date.now(), duration: 100 },
  { operation: 'op-2', startTime: Date.now() - 50, endTime: Date.now(), duration: 50 },
]);

// Record multiple request timings in batch
metricsCollector.recordRequestTimings([
  { contextId: 'req-1', serverTimeToFirstChunk: 500, serverStreamDuration: 2000, serverTotalDuration: 2500 },
  { contextId: 'req-2', serverTimeToFirstChunk: 300, serverStreamDuration: 1000, serverTotalDuration: 1300 },
]);

Batch operations are more efficient than a series of individual record*() calls and are useful for:

  • Migrating metrics from another system
  • Importing historical data
  • Other bulk workloads

See examples/batch-operations.ts for complete examples.

Derived Metrics

Calculate simple derived metrics like rates and trends:

import { calculateAgentDerivedMetrics, calculateToolDerivedMetrics, calculateTrend } from 'llm-metrics';

const snapshot = metricsCollector.getSnapshot();

// Calculate agent derived metrics
const agentDerived = calculateAgentDerivedMetrics(snapshot.agents, 3600000); // Last hour
console.log(`Error Rate: ${agentDerived.errorRate}%`);
console.log(`Requests/Second: ${agentDerived.requestsPerSecond}`);

// Calculate tool derived metrics
const toolDerived = calculateToolDerivedMetrics(snapshot.tools, 3600000);
console.log(`Success Rate: ${toolDerived.successRate}%`);

// Calculate trends between two periods (e.g., this window vs. the previous one)
const currentRate = agentDerived.requestsPerSecond;
const previousRate = 1.2; // e.g., saved from an earlier calculation
const trend = calculateTrend(currentRate, previousRate);
console.log(`Change: ${trend.changePercent}%`);

Available derived metrics:

  • Rates: Requests per second, operations per second
  • Error Rates: Error percentage, success percentage
  • Trends: Change between time periods, percentage change

See examples/derived-metrics.ts for complete examples.

Custom Metadata Extraction

import { measureAgentWithMetrics } from 'llm-metrics';

const result = await measureAgentWithMetrics(
  'data-processor',
  'batch-123',
  async () => {
    const data = await processBatch();
    return {
      items: data.items,
      errors: data.errors,
      stats: data.stats,
    };
  },
  (result) => ({
    itemsProcessed: result.items.length,
    errorCount: result.errors.length,
    averageScore: result.stats.averageScore,
    customMetric: result.stats.customValue,
  })
);

Time-Range Filtering

import { metricsCollector } from 'llm-metrics';

// Last hour
const lastHour = metricsCollector.getSummary(3600000);

// Last 24 hours
const lastDay = metricsCollector.getSummary(86400000);

// All time
const allTime = metricsCollector.getSummary();

Context-Based Queries

import { metricsCollector } from 'llm-metrics';

// Get all metrics for a specific context (conversation, session, etc.)
const contextMetrics = await metricsCollector.getContextMetrics('conversation-123');

console.log(`Agents: ${contextMetrics.agents.length}`);
console.log(`Tools: ${contextMetrics.tools.length}`);
console.log(`Latency operations: ${contextMetrics.latency.length}`);

Best Practices

  1. Use context IDs - Always provide contextId to track metrics across operations
  2. Extract meaningful metadata - Use metadata to store domain-specific information
  3. Configure persistence - For production, use a persistence backend
  4. Enable validation - Keep validation enabled to catch errors early
  5. Monitor memory usage - Adjust maxMetrics based on your needs
  6. Use helper functions - Prefer measureAgent/measureTool over manual recording

Performance Considerations

  • In-memory storage - Fast but limited by maxMetrics (default: 1000)
  • Persistence is async - Persistence operations don't block metric recording
  • Validation overhead - Can be disabled for maximum performance if needed (see the sketch after this list)
  • FIFO eviction - Oldest metrics are removed when limit is reached
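
Disabling validation uses the configure() options documented above; a one-line sketch:

import { metricsCollector } from 'llm-metrics';

// Trades early error detection for maximum recording throughput.
metricsCollector.configure({ validateMetrics: false });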

🔨 Building with Bun

This package is fully compatible with Bun and can be bundled directly:

# Bundle with Bun
bun build ./src/index.ts --outdir ./dist --target bun

# Or use Bun's bundler in your project
bun build node_modules/llm-metrics/dist/index.js --outdir ./bundled

🛠️ Technical Details

Modern JavaScript Only

This package uses ES Modules (ESM) only:

  • ✅ ES2022+ syntax
  • ✅ Native ESM imports/exports
  • ✅ Compatible with Bun 1.3+, Node.js 22+ (LTS), Deno
  • ❌ No CommonJS support
  • ❌ No legacy browser support

Requirements:

  • Node.js >= 22.0.0 (LTS) or Bun >= 1.3.0
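
If you must load the package from a CommonJS project anyway, dynamic import() is the standard Node.js escape hatch for ESM-only packages (a generic pattern, not specific to this README):

// In a CommonJS file, require('llm-metrics') will fail; use import() instead.
async function getCollector() {
  const { metricsCollector } = await import('llm-metrics');
  return metricsCollector;
}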

📊 Project Status

  • v0.7.0 - Latest release
  • 154 tests passing (290+ assertions)
  • ~95% code coverage (comprehensive edge case coverage)
  • 100% TypeScript type coverage
  • ESM-only (modern JavaScript)
  • Zero dependencies (runtime)

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

🔌 Creating Custom Adapters

Want to create your own persistence adapter? See src/adapters/README.md for:

  • Adapter interface documentation
  • PostgreSQL adapter example
  • MongoDB adapter example
  • Redis adapter example
  • Best practices and testing guidelines

Development Setup

# Clone the repository
git clone https://github.com/Arakiss/llm-metrics.git
cd llm-metrics

# Install dependencies
bun install

# Run tests
bun test

# Build
bun run build

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built for the LLM and AI agent ecosystem
  • Inspired by the need for better observability in agentic systems
  • Designed with performance and developer experience in mind

🔗 Links

  • npm: https://www.npmjs.com/package/llm-metrics
  • GitHub: https://github.com/Arakiss/llm-metrics
  • Issues: https://github.com/Arakiss/llm-metrics/issues
  • Releases: https://github.com/Arakiss/llm-metrics/releases
  • Changelog: https://github.com/Arakiss/llm-metrics/blob/main/CHANGELOG.md
  • Contributing: https://github.com/Arakiss/llm-metrics/blob/main/CONTRIBUTING.md
  • Security: https://github.com/Arakiss/llm-metrics/blob/main/SECURITY.md

Made with ❤️ for the LLM community

⭐ Star on GitHub · 📦 Install from npm · 🐛 Report Bug