llm-tps

v0.1.0

Published

13 days ago

CLI to benchmark LLM streaming TPS, TTFT, and thinking-token split

Vulnerabilities: 0 high, 0 medium, 0 low

naufaldi

llm benchmark tps streaming cli minimax codex

llm-tps

Benchmark CLI that measures real-world streaming TPS (tokens per second), TTFT (time to first token), and the thinking vs. final token split for LLM APIs, subscriptions, and coding agents.

By default, llm-tps bench prints the north-star table across all available providers, using a max_tokens sweep of 200, 500, and 1000.

Install

Requires Bun >= 1.1 (runtime for the CLI, even when installing via npm).

npm install -g llm-tps
# or
bun install -g llm-tps

Verify:

llm-tps --version

Prerequisites

| Provider | Preset ID | Auth |
|----------|-----------|------|
| MiniMax Token Plan | minimax-plan | mmx auth login → ~/.mmx/config.json |
| ChatGPT Codex Pro | codex-pro | codex login → ~/.codex/auth.json |
| OpenAI API | openai | OPENAI_API_KEY env var |
| Anthropic API | anthropic | ANTHROPIC_API_KEY env var |
| MiniMax API | minimax-api | MINIMAX_API_KEY env var |
| Cursor SDK | cursor | CURSOR_API_KEY env var + optional @cursor/sdk |

Copy .env.example to .env for API keys. Subscription providers reuse upstream CLI OAuth tokens.
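
To make the API-key rows of the table concrete, here is a minimal sketch of checking which API-key providers are configured. It is not the CLI's actual detection logic: the function name `configuredApiProviders` and the `Env` type are hypothetical, and only the preset IDs and variable names come from the table above.

```typescript
// Hypothetical sketch: NOT llm-tps's real detection code.
// Preset IDs and env var names are copied from the Prerequisites table.
const API_KEY_VARS: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  "minimax-api": "MINIMAX_API_KEY",
  cursor: "CURSOR_API_KEY",
};

type Env = Record<string, string | undefined>;

// Returns the preset IDs whose API key variable is set to a non-empty value.
function configuredApiProviders(env: Env): string[] {
  return Object.entries(API_KEY_VARS)
    .filter(([, varName]) => (env[varName] ?? "").trim() !== "")
    .map(([presetId]) => presetId);
}

// Usage under Bun or Node: console.log(configuredApiProviders(process.env));
```

Subscription presets (minimax-plan, codex-pro) are left out of the sketch because they rely on token files written by the upstream CLIs rather than on environment variables.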

Check what's available:

llm-tps providers
llm-tps providers doctor

Quick start

# all available providers, default sweep and workload
llm-tps bench

# one provider, one workload
llm-tps bench --provider codex-pro --workload long --runs 5

# override model or token sweep
llm-tps bench --provider minimax-api --model MiniMax-M3 --max-tokens 200,500,1000

# output formats
llm-tps bench --output json
llm-tps bench --output csv --out results.csv
llm-tps bench --output markdown

# skip warmup call
llm-tps bench --no-warmup

Metrics

TTFT — ms from request start to first streaming delta
TPS_final — final answer tokens ÷ generation time (excludes thinking)
TPS_total — all output tokens ÷ generation time (includes reasoning)
FinalTok / ThinkTok — token counts from provider usage or think-tag parsing

Headline values are the median of the measured runs (default 3). Run 1 is a warmup and is discarded unless --no-warmup is passed.

See docs/METRICS.md for formulas and comparison notes.
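
The definitions above can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not the code in docs/METRICS.md: the `RunSample` fields and function names are hypothetical, and "generation time" is assumed to run from the first streaming delta to the last, with the same denominator used for both TPS values.

```typescript
// Hypothetical shape for one run; field names are illustrative only.
interface RunSample {
  requestStartMs: number; // when the request was sent
  firstDeltaMs: number;   // when the first streaming delta arrived
  lastDeltaMs: number;    // when the last streaming delta arrived
  finalTokens: number;    // answer tokens (FinalTok)
  thinkTokens: number;    // reasoning tokens (ThinkTok)
}

interface RunMetrics {
  ttftMs: number;
  tpsFinal: number;
  tpsTotal: number;
}

// Assumption: generation time = last delta minus first delta.
function computeMetrics(s: RunSample): RunMetrics {
  const genSeconds = (s.lastDeltaMs - s.firstDeltaMs) / 1000;
  if (genSeconds <= 0) throw new Error("generation time must be positive");
  return {
    ttftMs: s.firstDeltaMs - s.requestStartMs,
    tpsFinal: s.finalTokens / genSeconds,
    tpsTotal: (s.finalTokens + s.thinkTokens) / genSeconds,
  };
}

function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Drops the first run as warmup (when enabled), then takes per-metric medians.
function headline(runs: RunSample[], warmup = true): RunMetrics {
  const measured = (warmup ? runs.slice(1) : runs).map(computeMetrics);
  return {
    ttftMs: median(measured.map((m) => m.ttftMs)),
    tpsFinal: median(measured.map((m) => m.tpsFinal)),
    tpsTotal: median(measured.map((m) => m.tpsTotal)),
  };
}
```

Taking the median per metric, rather than picking one "median run", means the three headline numbers may come from different runs; whether the tool does the same is not stated here.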

History

Every successful run appends to ~/.llm-tps-bench/history.jsonl:

llm-tps history
llm-tps history --last 50
llm-tps history --provider codex-pro --since 7d
llm-tps history --trend
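
Because the history file is JSONL (one JSON value per line), it can also be read outside the CLI. The sketch below is a generic JSONL reader, not the tool's own code: the helper names `parseJsonl` and `lastN` are hypothetical, and records are left as `unknown` because the history schema is not described in this README.

```typescript
// Parses JSONL text: one JSON value per non-empty line.
// Records stay untyped; the history schema is not documented here.
function parseJsonl(text: string): unknown[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as unknown);
}

// Keeps only the last n records, similar in spirit to `--last`.
function lastN<T>(records: T[], n: number): T[] {
  return n <= 0 ? [] : records.slice(-n);
}

// Usage under Bun ("~" must be expanded to the home directory first):
// const text = await Bun.file(`${process.env.HOME}/.llm-tps-bench/history.jsonl`).text();
// console.log(lastN(parseJsonl(text), 50));
```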

Development

git clone https://github.com/naufaldi/llm-tps-bench.git
cd llm-tps-bench
bun install
bun test
bun run typecheck
bun run lint
bun run build          # produces dist/llm-tps
bun run dev -- bench   # run from source

Contributing: fork, branch, ensure bun test passes, open a PR. See AGENTS.md for architecture and conventions.

License

MIT
