llm-tps
v0.1.0
CLI to benchmark LLM streaming TPS, TTFT, and thinking-token split
Benchmark CLI that measures real-world streaming TPS, TTFT, and think vs final token split for LLM APIs, subscriptions, and coding agents.
Run without options, llm-tps bench prints the north-star table across all available providers and a max_tokens sweep (200, 500, 1000).
Install
Requires Bun >= 1.1 (runtime for the CLI, even when installing via npm).
npm install -g llm-tps
# or
bun install -g llm-tps
Verify:
llm-tps --version
Prerequisites
| Provider | Preset ID | Auth |
|----------|-----------|------|
| MiniMax Token Plan | minimax-plan | mmx auth login → ~/.mmx/config.json |
| ChatGPT Codex Pro | codex-pro | codex login → ~/.codex/auth.json |
| OpenAI API | openai | OPENAI_API_KEY env var |
| Anthropic API | anthropic | ANTHROPIC_API_KEY env var |
| MiniMax API | minimax-api | MINIMAX_API_KEY env var |
| Cursor SDK | cursor | CURSOR_API_KEY env var + optional @cursor/sdk |
Copy .env.example to .env for API keys. Subscription providers reuse upstream CLI OAuth tokens.
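As a rough illustration of the env-var side of this table, the sketch below reports which API-key presets have a non-empty key set. The env-var names come from the table above; `API_KEY_PRESETS` and `configuredPresets` are hypothetical names, and this is not how llm-tps itself detects providers.

```typescript
// Preset ID -> env var, copied from the prerequisites table.
// Subscription presets (minimax-plan, codex-pro) use OAuth files and are not covered here.
const API_KEY_PRESETS: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  "minimax-api": "MINIMAX_API_KEY",
  cursor: "CURSOR_API_KEY",
};

// Hypothetical helper: returns preset IDs whose env var is set and non-blank.
function configuredPresets(env: Record<string, string | undefined>): string[] {
  return Object.entries(API_KEY_PRESETS)
    .filter(([, varName]) => (env[varName] ?? "").trim() !== "")
    .map(([preset]) => preset);
}

// Example with a literal environment; a blank value counts as unset.
console.log(configuredPresets({ OPENAI_API_KEY: "sk-example", MINIMAX_API_KEY: "  " }));
```

In a real shell you would pass `process.env` instead of the literal object. The authoritative check remains llm-tps providers.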
Check what's available:
llm-tps providers
llm-tps providers doctor
Quick start
# all available providers, default sweep and workload
llm-tps bench
# one provider, one workload
llm-tps bench --provider codex-pro --workload long --runs 5
# override model or token sweep
llm-tps bench --provider minimax-api --model MiniMax-M3 --max-tokens 200,500,1000
# output formats
llm-tps bench --output json
llm-tps bench --output csv --out results.csv
llm-tps bench --output markdown
# skip warmup call
llm-tps bench --no-warmup
Metrics
- TTFT: milliseconds from request start to the first streaming delta
- TPS_final: final-answer tokens ÷ generation time (excludes thinking)
- TPS_total: all output tokens ÷ generation time (includes reasoning)
- FinalTok / ThinkTok: token counts from provider usage or think-tag parsing
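To make the think-tag fallback concrete, here is a minimal sketch of separating reasoning text from the final answer. It assumes the model wraps reasoning in literal `<think>...</think>` tags; the tag name, `SplitOutput`, and `splitThinking` are assumptions for illustration, not the parser llm-tps ships. Turning each part into a token count would still need a tokenizer, which is out of scope here.

```typescript
interface SplitOutput {
  thinking: string; // concatenated contents of all think blocks
  final: string;    // everything outside think blocks
}

// Hypothetical helper: assumes closed <think>...</think> pairs in the full output text.
function splitThinking(text: string): SplitOutput {
  const thinkBlock = /<think>([\s\S]*?)<\/think>/g;
  const thinkingParts: string[] = [];
  // Remove each think block from the text and collect its inner content.
  const final = text.replace(thinkBlock, (_match: string, inner: string) => {
    thinkingParts.push(inner.trim());
    return "";
  });
  return { thinking: thinkingParts.join("\n"), final: final.trim() };
}

const sample = splitThinking("<think>2 + 2, carry nothing</think>The answer is 4.");
console.log(sample.thinking); // "2 + 2, carry nothing"
console.log(sample.final);    // "The answer is 4."
```

This sketch works on the complete response text. An unclosed tag, as can occur mid-stream, is left in `final` untouched.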
Headline values are the median of the measured runs (default 3). Run 1 is a warmup and is discarded unless --no-warmup is set.
See docs/METRICS.md for formulas and comparison notes.
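The metric definitions above can be sketched in a few lines. This is an illustration under stated assumptions, not the formulas from docs/METRICS.md: it takes "generation time" to be the span from first to last streaming delta, and all type, field, and function names (`RunSample`, `runMetrics`, `summarize`) are hypothetical.

```typescript
interface RunSample {
  requestStartMs: number; // timestamp when the request was sent
  firstDeltaMs: number;   // timestamp of the first streaming delta
  lastDeltaMs: number;    // timestamp of the last streaming delta
  finalTokens: number;    // FinalTok
  thinkTokens: number;    // ThinkTok
}

interface Metrics { ttftMs: number; tpsFinal: number; tpsTotal: number; }

function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function runMetrics(s: RunSample): Metrics {
  // Assumption: generation time runs from first delta to last delta.
  const generationSec = (s.lastDeltaMs - s.firstDeltaMs) / 1000;
  if (generationSec <= 0) throw new Error("generation time must be positive");
  return {
    ttftMs: s.firstDeltaMs - s.requestStartMs,
    tpsFinal: s.finalTokens / generationSec,
    tpsTotal: (s.finalTokens + s.thinkTokens) / generationSec,
  };
}

// Drops run 1 when warmup is enabled, then takes the per-metric median.
function summarize(samples: RunSample[], warmup = true): Metrics {
  const measured = (warmup ? samples.slice(1) : samples).map(runMetrics);
  return {
    ttftMs: median(measured.map((m) => m.ttftMs)),
    tpsFinal: median(measured.map((m) => m.tpsFinal)),
    tpsTotal: median(measured.map((m) => m.tpsTotal)),
  };
}

const one = runMetrics({ requestStartMs: 0, firstDeltaMs: 400, lastDeltaMs: 2400, finalTokens: 150, thinkTokens: 50 });
console.log(one); // { ttftMs: 400, tpsFinal: 75, tpsTotal: 100 }
```

Using the median rather than the mean keeps a single slow run from skewing the headline numbers.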
History
Every successful run appends to ~/.llm-tps-bench/history.jsonl:
llm-tps history
llm-tps history --last 50
llm-tps history --provider codex-pro --since 7d
llm-tps history --trend
Development
git clone https://github.com/naufaldi/llm-tps-bench.git
cd llm-tps-bench
bun install
bun test
bun run typecheck
bun run lint
bun run build # produces dist/llm-tps
bun run dev -- bench # run from source
Contributing: fork, branch, ensure bun test passes, and open a PR. See AGENTS.md for architecture and conventions.
