llm-speed-bench

v1.5.2

A CLI tool to benchmark the performance of OpenAI-compatible LLM providers.

LLM Speed Bench

llm-speed-bench is a command-line interface (CLI) tool for benchmarking the performance of Large Language Model (LLM) providers that offer an OpenAI-compatible API.

It is designed to provide detailed, actionable data on the output speed and latency characteristics of different models and providers. It measures key performance indicators from the moment a request is sent until the final token of the response is received, with a focus on streaming APIs.
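The measurement window described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: `timeStream`, `StreamTiming`, and the injectable `now` clock are hypothetical names, and the sketch assumes the response arrives as an async iterable of text chunks.

```typescript
// Hypothetical result shape; field names are illustrative only.
interface StreamTiming {
  timeToFirstTokenMs: number | null; // null if no non-empty chunk arrived
  totalWallClockTimeMs: number;
  interTokenLatenciesMs: number[];
  chunkCount: number;
}

// `send` starts the request and returns the streamed chunks.
// `now` is injectable so the timing logic can be tested with a fake clock.
async function timeStream(
  send: () => AsyncIterable<string>,
  now: () => number = Date.now,
): Promise<StreamTiming> {
  const start = now(); // taken immediately before the request is sent
  let previous: number | null = null;
  let timeToFirstTokenMs: number | null = null;
  const interTokenLatenciesMs: number[] = [];
  let chunkCount = 0;

  for await (const chunk of send()) {
    if (chunk.length === 0) continue; // ignore empty chunks
    const arrived = now();
    if (previous === null) {
      timeToFirstTokenMs = arrived - start;
    } else {
      interTokenLatenciesMs.push(arrived - previous);
    }
    previous = arrived;
    chunkCount += 1;
  }

  return {
    timeToFirstTokenMs,
    totalWallClockTimeMs: now() - start,
    interTokenLatenciesMs,
    chunkCount,
  };
}
```

Injecting the clock keeps the sketch deterministic under test; a real run would likely use a higher-resolution timer than `Date.now`.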

Features

  • OpenAI-Compatible: Works with any API that adheres to the OpenAI specification for streaming chat completions.
  • Streaming First: Benchmarks performance by leveraging the provider's streaming API to get detailed timing data.
  • Detailed Performance Metrics: Collects and calculates a comprehensive set of metrics, including token counts, time to first token, inter-token latency, and overall throughput.
  • ASCII Graphs: Visualize TPS and inter-token latency over time with the --graph option.
  • Flexible Configuration: Manage inputs via both command-line arguments and environment variables.
  • Multiple Output Formats: Presents results in a clean, human-readable format, with an option for machine-readable JSON.
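To illustrate what "streaming first" means in practice, the sketch below parses a single line of an OpenAI-style streaming response, where each event arrives as a `data: {...}` line and the stream ends with `data: [DONE]`. The function name `parseSseLine` and the `SseEvent` type are hypothetical and not taken from this package's source.

```typescript
// Hypothetical event type for one line of a streamed chat completion.
type SseEvent =
  | { kind: "token"; text: string }
  | { kind: "done" }
  | { kind: "ignored" };

function parseSseLine(line: string): SseEvent {
  const trimmed = line.trim();
  // Comments, blank lines, and other fields are not token events.
  if (!trimmed.startsWith("data:")) return { kind: "ignored" };

  const payload = trimmed.slice("data:".length).trim();
  if (payload === "[DONE]") return { kind: "done" };

  try {
    const parsed = JSON.parse(payload) as {
      choices?: { delta?: { content?: string | null } }[];
    };
    const text = parsed.choices?.[0]?.delta?.content;
    return typeof text === "string" && text.length > 0
      ? { kind: "token", text }
      : { kind: "ignored" };
  } catch {
    // Malformed or unexpected payloads are skipped rather than aborting the run.
    return { kind: "ignored" };
  }
}
```

A benchmark loop would typically timestamp each `token` event as it arrives and stop at `done`.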

Installation

Bun (Recommended)

bun install

Running

bun run src/index.ts [options]

Usage

Configuration can be provided through command-line arguments or environment variables.

Configuration

| Parameter | CLI Argument | Environment Variable | Required | Description |
| :--- | :--- | :--- | :--- | :--- |
| API Base URL | --api-base-url <url> | LLM_API_URL | Yes | The base URL for the OpenAI-compatible API. |
| API Key | --api-key <key> | LLM_API_KEY | Yes | The authentication key for the API. |
| Model Name | --model <name> | LLM_MODEL_NAME | Yes | The specific model to be benchmarked (e.g., gpt-4o). |
| Prompt | --prompt <text> | LLM_PROMPT | Yes | The input text to send to the model. |
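One way to combine the two configuration sources is sketched below. The flag and variable names come from the table above; everything else is an assumption, including the `resolveConfig` and `BenchConfig` names and, in particular, the rule that a command-line argument takes precedence over its environment variable, which this README does not specify.

```typescript
interface BenchConfig {
  apiBaseUrl: string;
  apiKey: string;
  model: string;
  prompt: string;
}

// Flag and environment variable names as listed in the configuration table.
const SOURCES: Record<keyof BenchConfig, { flag: string; env: string }> = {
  apiBaseUrl: { flag: "--api-base-url", env: "LLM_API_URL" },
  apiKey: { flag: "--api-key", env: "LLM_API_KEY" },
  model: { flag: "--model", env: "LLM_MODEL_NAME" },
  prompt: { flag: "--prompt", env: "LLM_PROMPT" },
};

// Assumption: CLI arguments win over environment variables.
function resolveConfig(
  argv: string[],
  env: Record<string, string | undefined>,
): BenchConfig {
  const resolved: Partial<BenchConfig> = {};
  const missing: string[] = [];

  for (const key of Object.keys(SOURCES) as (keyof BenchConfig)[]) {
    const { flag, env: envName } = SOURCES[key];
    const index = argv.indexOf(flag);
    const fromCli = index >= 0 ? argv[index + 1] : undefined;
    const value = fromCli ?? env[envName];
    if (value === undefined || value === "") {
      missing.push(`${flag} / ${envName}`);
    } else {
      resolved[key] = value;
    }
  }

  if (missing.length > 0) {
    throw new Error(`Missing required configuration: ${missing.join(", ")}`);
  }
  return resolved as BenchConfig;
}
```

Passing `argv` and `env` as parameters, rather than reading globals, keeps the function easy to test.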

Examples

Using Command-Line Arguments

bun run src/index.ts \
  --api-base-url "https://api.openai.com/v1" \
  --api-key "sk-..." \
  --model "gpt-4o" \
  --prompt "Tell me a short story about a robot who discovers music."

Using Environment Variables

export LLM_API_URL="https://api.openai.com/v1"
export LLM_API_KEY="sk-..."
export LLM_MODEL_NAME="gpt-4o"
export LLM_PROMPT="Tell me a short story about a robot who discovers music."

bun run src/index.ts

Getting JSON Output

bun run src/index.ts --json > results.json

Showing ASCII Graphs

bun run src/index.ts --graph

This displays two ASCII graphs:

  • TPS Over Time: Tokens per second throughout the response
  • Inter-Token Latency Over Time: Latency between each token

The graphs automatically adjust to your terminal width.
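Fitting a series to the terminal width can be done by averaging neighbouring samples into one bucket per column. The sketch below is a hedged illustration of that idea; `bucketize` and `renderGraph` are hypothetical names, and the tool's real rendering may differ.

```typescript
// Average `values` into at most `width` buckets so the series fits the available columns.
function bucketize(values: number[], width: number): number[] {
  if (values.length <= width) return [...values];
  const buckets: number[] = [];
  for (let i = 0; i < width; i++) {
    const start = Math.floor((i * values.length) / width);
    const end = Math.floor(((i + 1) * values.length) / width);
    const slice = values.slice(start, end);
    buckets.push(slice.reduce((sum, v) => sum + v, 0) / slice.length);
  }
  return buckets;
}

// Render a simple column chart, one character per bucket, `height` rows tall.
function renderGraph(values: number[], width: number, height = 8): string {
  const columns = bucketize(values, Math.max(1, width));
  if (columns.length === 0) return "";
  const max = Math.max(...columns);
  // Scale each column to a whole number of rows relative to the tallest column.
  const heights = columns.map((v) => (max > 0 ? Math.round((v / max) * height) : 0));
  const rows: string[] = [];
  for (let row = height; row >= 1; row--) {
    rows.push(heights.map((h) => (h >= row ? "#" : " ")).join(""));
  }
  return rows.join("\n");
}
```

In Node.js, the width could be taken from `process.stdout.columns` when output is a terminal, with a fixed fallback otherwise.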

Output Format

Standard Output

The default output is a human-readable summary:

LLM Benchmark Results
=======================

Configuration
-----------------------
Provider API Base:   https://api.groq.com/openai
Model:               llama3-70b-8192

Metrics
-----------------------
Time to First Token:   152 ms
Total Wall Clock Time: 2,130 ms
Overall Output Rate:   234.7 tokens/sec

Token Counts
-----------------------
Prompt Tokens:         35 (estimated)
Output Tokens:         450

Inter-Token Latency (ms)
-----------------------
Min:                 2 ms
Mean:                4.1 ms
Median:              4 ms
Max:                 15 ms
p90:                 6 ms
p95:                 8 ms
p99:                 12 ms
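The inter-token latency figures above are ordinary summary statistics over the gaps between consecutive tokens. The sketch below shows one way to compute them; `summarizeLatencies` is a hypothetical name, and the nearest-rank percentile method is an assumption, since this README does not state which method the tool uses.

```typescript
interface LatencySummary {
  min: number;
  mean: number;
  median: number;
  max: number;
  p90: number;
  p95: number;
  p99: number;
}

function summarizeLatencies(latenciesMs: number[]): LatencySummary {
  if (latenciesMs.length === 0) {
    throw new Error("At least one latency sample is required");
  }
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const n = sorted.length;
  const at = (index: number): number => sorted[index] ?? NaN;

  // Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
  const percentile = (p: number): number =>
    at(Math.max(0, Math.ceil((p * n) / 100) - 1));

  const mid = Math.floor(n / 2);
  const median = n % 2 === 1 ? at(mid) : (at(mid - 1) + at(mid)) / 2;
  const mean = sorted.reduce((sum, v) => sum + v, 0) / n;

  return {
    min: at(0),
    mean,
    median,
    max: at(n - 1),
    p90: percentile(90),
    p95: percentile(95),
    p99: percentile(99),
  };
}
```

With few samples, the high percentiles collapse toward the maximum, so p99 is only informative for longer responses.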

JSON Output (--json)

The JSON output includes all the calculated metrics and configuration details.

{
  "configuration": {
    "apiBaseUrl": "https://api.groq.com/openai",
    "model": "llama3-70b-8192"
  },
  "metrics": {
    "timeToFirstTokenMs": 152,
    "totalWallClockTimeMs": 2130,
    "overallOutputRateTps": 234.7
  },
  "tokenCounts": {
    "promptTokens": 35,
    "outputTokens": 450
  },
  "interTokenLatencyMs": {
    "min": 2,
    "mean": 4.1,
    "median": 4,
    "max": 15,
    "p90": 6,
    "p95": 8,
    "p99": 12
  }
}
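For scripts that consume this output, a typed parser can catch a missing or renamed field early. The sketch below mirrors the field names shown in the example above; the `BenchmarkResult` interface and `parseBenchmarkResult` function are hypothetical and are not exported by this package.

```typescript
interface BenchmarkResult {
  configuration: { apiBaseUrl: string; model: string };
  metrics: {
    timeToFirstTokenMs: number;
    totalWallClockTimeMs: number;
    overallOutputRateTps: number;
  };
  tokenCounts: { promptTokens: number; outputTokens: number };
  interTokenLatencyMs: {
    min: number; mean: number; median: number; max: number;
    p90: number; p95: number; p99: number;
  };
}

type Obj = Record<string, unknown>;

function section(root: Obj, name: string): Obj {
  const value = root[name];
  if (typeof value !== "object" || value === null) {
    throw new Error(`Missing section "${name}"`);
  }
  return value as Obj;
}

function num(obj: Obj, key: string): number {
  const value = obj[key];
  if (typeof value !== "number") throw new Error(`Expected number at "${key}"`);
  return value;
}

function str(obj: Obj, key: string): string {
  const value = obj[key];
  if (typeof value !== "string") throw new Error(`Expected string at "${key}"`);
  return value;
}

function parseBenchmarkResult(json: string): BenchmarkResult {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Expected a JSON object");
  }
  const root = parsed as Obj;
  const c = section(root, "configuration");
  const m = section(root, "metrics");
  const t = section(root, "tokenCounts");
  const l = section(root, "interTokenLatencyMs");
  return {
    configuration: { apiBaseUrl: str(c, "apiBaseUrl"), model: str(c, "model") },
    metrics: {
      timeToFirstTokenMs: num(m, "timeToFirstTokenMs"),
      totalWallClockTimeMs: num(m, "totalWallClockTimeMs"),
      overallOutputRateTps: num(m, "overallOutputRateTps"),
    },
    tokenCounts: { promptTokens: num(t, "promptTokens"), outputTokens: num(t, "outputTokens") },
    interTokenLatencyMs: {
      min: num(l, "min"), mean: num(l, "mean"), median: num(l, "median"), max: num(l, "max"),
      p90: num(l, "p90"), p95: num(l, "p95"), p99: num(l, "p99"),
    },
  };
}
```

The input string could be the contents of the results.json file written in the "Getting JSON Output" example.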

Development

Running with ts-node

To run the tool in development mode without building, you can use ts-node:

npx ts-node src/index.ts --api-base-url ...

Local Installation and Testing

To test the CLI locally as if it were globally installed, you can use npm link. This is the best way to test the final command-line experience before publishing.

  1. Build the project: Make sure your latest changes are compiled.

    npm run build

  2. Link the package: This creates a global symbolic link to your local project.

    npm link

  3. Run the command globally: You can now run the command from any directory.

    llm-speed-bench --api-base-url "..." --api-key "..."

  4. Rebuild after changes: Whenever you change the source code, just re-run the build command. The symbolic link will ensure your global command always uses the latest compiled code.

    npm run build

  5. Unlink the package: When you're done with local testing, you can remove the global link.

    npm unlink llm-speed-bench