
@jsleekr/reqbench

v1.0.0


API benchmarking with A/B comparison and statistical significance testing


⚡ reqbench

API benchmarking with statistical comparison


Measure latency percentiles, compare endpoints with Welch's t-test, and automate performance gates in CI

Why This Exists | Quick Start | Commands | Example Output | Scenarios | CI Integration


Why This Exists

Most load testing tools give you numbers but not answers. You get a p99 and a mean -- but is version B actually faster than version A, or did noise just fall your way this run?

reqbench goes further -- when you compare two endpoints, it runs Welch's t-test and tells you whether the difference is statistically significant or just noise. A warm-up phase discards early results so connection pool effects and DNS caches don't skew your measurements. Scenarios let you chain multi-step auth flows with variable extraction. And CI-friendly exit codes mean you can block a deploy when latency regresses past a threshold.

  • Welch's t-test A/B comparison -- tells you whether the difference is real or noise, not just bigger or smaller
  • Warm-up phase -- discards early requests to eliminate connection pool and JIT effects before measurement
  • Multi-step YAML scenarios -- chain requests with variable extraction for auth flows and stateful endpoints
  • Zero heavy dependencies -- only commander and js-yaml; no Rust binaries, no native modules

Requirements

  • Node.js >= 18.0.0

Quick Start

# Install globally
npm install -g @jsleekr/reqbench

# Benchmark a single endpoint
reqbench run https://api.example.com/health

# A/B compare two endpoints
reqbench compare https://api-v1.example.com/data https://api-v2.example.com/data

# With options
reqbench run https://api.example.com/users \
  -c 50 \
  -d 30 \
  -m POST \
  -H "Authorization: Bearer TOKEN" \
  -f json

Commands

reqbench run <url>

Benchmark a single endpoint and display latency percentiles, RPS, error rate, and a histogram.

| Option | Description | Default |
|--------|-------------|---------|
| -c, --concurrency <n> | Concurrent connections | 10 |
| -d, --duration <seconds> | Test duration in seconds | 10 |
| -m, --method <method> | HTTP method (GET/POST/PUT/PATCH/DELETE/HEAD/OPTIONS) | GET |
| -H, --header <header> | HTTP header in Key: Value format (repeatable) | -- |
| -b, --body <body> | Request body string | -- |
| -w, --warmup <seconds> | Warm-up duration (results discarded) | 2 |
| -t, --timeout <ms> | Per-request timeout in milliseconds | 5000 |
| -f, --format <format> | Output format: terminal, json, markdown | terminal |
| -p, --profile <name> | Load options from a saved profile | -- |


reqbench compare <url1> <url2>

Benchmark both endpoints under identical conditions and compare the results with Welch's t-test. Reports p-value, statistical significance, and winner.

reqbench compare https://api-v1.example.com/endpoint https://api-v2.example.com/endpoint

# With higher concurrency and longer duration for more reliable results
reqbench compare https://v1.example.com/api https://v2.example.com/api -c 20 -d 60

# JSON output for CI pipelines
reqbench compare https://v1.example.com https://v2.example.com -f json

reqbench scenario <file>

Run a multi-step scenario from a YAML file. Supports variable extraction between steps for auth flows, token-based workflows, and any multi-request sequence.

reqbench scenario auth-flow.yaml
reqbench scenario api-workflow.yaml -c 10 -d 30

reqbench profile save|list|delete

Manage named connection profiles to avoid repeating common options.

# Save a profile
reqbench profile save myapi -u https://api.example.com -m POST -H "Authorization: Bearer TOKEN"

# List saved profiles
reqbench profile list

# Use a profile in a benchmark
reqbench run https://api.example.com/endpoint -p myapi

# Delete a profile
reqbench profile delete myapi

Profiles are stored as JSON files in ~/.reqbench/profiles/. Profile names are validated to prevent path traversal.
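As an illustration of that validation (a hypothetical sketch, not the actual reqbench source; the function name is invented), restricting profile names to a safe character set rules out traversal before any filesystem path is ever built:

```typescript
// Hypothetical sketch of profile-name validation: allow only a safe
// character set so names like "../../etc/passwd" can never reach a
// path.join() call on the profiles directory.
function isValidProfileName(name: string): boolean {
  // Letters, digits, hyphen, underscore; 1-64 characters.
  return /^[A-Za-z0-9_-]{1,64}$/.test(name);
}

console.log(isValidProfileName("myapi"));         // → true
console.log(isValidProfileName("../etc/passwd")); // → false
```

Rejecting by allow-list rather than scanning for ../ also catches absolute paths, null bytes, and platform-specific separators in one check.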


Example Output

Terminal (default)

  URL:         https://api.example.com/health
  Duration:    10.02s
  Requests:    4523
  RPS:         451.40
  Error Rate:  0.00%

  Latency (ms):
    p50:   18.32
    p95:   45.67
    p99:   98.41
    mean:  22.14
    stdev: 15.82
    min:   3.21
    max:   152.88

  Histogram:
    0.0-30.0ms           ########################## 3200 (70.8%)
    30.0-60.0ms          ########                    900 (19.9%)
    60.0-90.0ms          ###                         320 ( 7.1%)
    90.0-150.0ms         #                           103 ( 2.3%)

A/B Comparison

  A/B Comparison
  ─────────────────────────────────────────────────────
  Metric               A               B               Diff
  ─────────────────────────────────────────────────────
  p50 (ms)             18.32           42.15           -23.83
  p95 (ms)             45.67           89.24           -43.57
  p99 (ms)             98.41          156.30           -57.89
  RPS                 451.40          220.38          +231.02
  Mean (ms)            22.14           48.33           -26.19
  Stdev (ms)           15.82           32.70           -16.88
  Error Rate            0.00            0.00             0.00
  ─────────────────────────────────────────────────────
  p-value: 0.000012
  Significant: Yes
  Winner: A

  A is faster than B by 56.5% (p=0.0000).

JSON output

reqbench run https://api.example.com/health -f json
{
  "url": "https://api.example.com/health",
  "duration": 10.02,
  "requests": 4523,
  "rps": 451.40,
  "errorRate": 0.00,
  "latency": {
    "p50": 18.32,
    "p95": 45.67,
    "p99": 98.41,
    "mean": 22.14,
    "stdev": 15.82,
    "min": 3.21,
    "max": 152.88
  }
}
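Because -f json writes a single JSON document to stdout, a gate script can parse it directly. The sketch below is hypothetical (not part of reqbench); it assumes only the field names shown in the output above:

```typescript
// Hypothetical gate script: parse the JSON that `reqbench run ... -f json`
// writes and enforce a latency budget. Field names follow the shape above.
interface RunResult {
  url: string;
  rps: number;
  errorRate: number;
  latency: { p50: number; p95: number; p99: number };
}

function checkBudget(result: RunResult, p95BudgetMs: number): boolean {
  // Pass only if p95 is within budget and nothing errored.
  return result.latency.p95 <= p95BudgetMs && result.errorRate === 0;
}

// Inline sample standing in for a saved result.json file.
const sample =
  '{"url":"https://api.example.com/health","rps":451.4,"errorRate":0,' +
  '"latency":{"p50":18.32,"p95":45.67,"p99":98.41}}';
const result: RunResult = JSON.parse(sample);
console.log(checkBudget(result, 100)); // → true (45.67 <= 100, no errors)
```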

Scenario Files

Create a YAML file to describe multi-step workflows. Variable extraction lets you pass values (such as auth tokens) between steps.

name: Auth Flow
concurrency: 5
duration: 30
steps:
  - name: Login
    url: https://api.example.com/auth/login
    method: POST
    body: '{"username":"test","password":"pass"}'
    extract:
      token: token

  - name: Get Profile
    url: https://api.example.com/users/me
    method: GET
    headers:
      Authorization: "Bearer {{token}}"

  - name: Update Profile
    url: https://api.example.com/users/me
    method: PUT
    headers:
      Authorization: "Bearer {{token}}"
    body: '{"displayName":"Test User"}'

The extract map reads a field from the JSON response body and stores it as a variable. Use {{variableName}} in subsequent steps.
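The extract-then-substitute mechanic might look roughly like this (an illustrative sketch with invented function names, not the actual scenario runner, which may handle nested fields and non-string values differently):

```typescript
// Hypothetical sketch of scenario variable handling: pull a field out of
// a JSON response body, then substitute {{name}} placeholders into later
// steps' headers and bodies.
function extractVars(
  body: string,
  extract: Record<string, string>
): Record<string, string> {
  const json = JSON.parse(body);
  const vars: Record<string, string> = {};
  for (const [name, field] of Object.entries(extract)) {
    vars[name] = String(json[field]); // top-level field lookup
  }
  return vars;
}

function interpolate(template: string, vars: Record<string, string>): string {
  // Replace every {{name}} with the stored variable value.
  return template.replace(/\{\{(\w+)\}\}/g, (_, name) => vars[name] ?? "");
}

const vars = extractVars('{"token":"abc123"}', { token: "token" });
console.log(interpolate("Bearer {{token}}", vars)); // → "Bearer abc123"
```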


CI Integration

Block deploys on latency regression

name: Performance Gate

on:
  pull_request:
    branches: [main]

jobs:
  bench:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Start staging server
        run: npm start &

      - name: Compare PR vs main latency
        run: |
          npx reqbench compare \
            https://main.example.com/api \
            https://staging.example.com/api \
            -c 20 -d 30 -f json > result.json

      - name: Check regression
        run: |
          P50_DIFF=$(cat result.json | jq '.comparison.p50Diff')
          if (( $(echo "$P50_DIFF > 20" | bc -l) )); then
            echo "p50 latency regressed by ${P50_DIFF}ms -- blocking merge"
            exit 1
          fi

Post benchmark results as PR comment

      - name: Run benchmark
        id: bench
        run: |
          OUTPUT=$(npx reqbench run https://api.example.com/health -f markdown)
          echo "report<<EOF" >> $GITHUB_OUTPUT
          echo "$OUTPUT" >> $GITHUB_OUTPUT
          echo "EOF" >> $GITHUB_OUTPUT

      - name: Comment on PR
        uses: actions/github-script@v7
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Benchmark Results\n\n${{ steps.bench.outputs.report }}`
            })

Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success -- benchmark completed normally |
| 1 | Comparison result -- B is faster (useful for A/B regression checks) |
| 2 | Validation error -- invalid URL, method, headers |
| 3 | Runtime error -- connection refused, timeout, etc. |


How It Works

Request Phase       Warmup Filter      Measurement        Statistics         Output
─────────────       ─────────────      ─────────────      ─────────────      ──────────
concurrent      →   discard first   →  collect           →  p50/p95/p99    →  terminal
workers             W seconds          latencies             mean/stdev        json
                                                             RPS               markdown
                                                             error rate
                                                             Welch's t-test
                                                             (compare mode)
  1. Warm-up -- Fires requests for the configured warm-up period and discards results. Eliminates connection pool startup, DNS resolution, and server-side JIT effects.
  2. Measurement -- Concurrent workers fire requests for the configured duration. Each latency sample is recorded with microsecond precision.
  3. Statistics -- Percentiles are computed from the collected sample array. Standard deviation uses Welch's online algorithm. For compare mode, Welch's t-test evaluates significance.
  4. Histogram -- Built with a loop-based accumulator safe for 500K+ samples (no Math.min(...array) stack overflow).
  5. Output -- Results are formatted and written to stdout. Exit codes reflect outcome for CI consumption.
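Steps 3 and 4 can be sketched as follows (a minimal illustration with invented function names, not the reqbench source; it uses the nearest-rank percentile method, which may differ from the actual interpolation used):

```typescript
// Hypothetical sketch of the statistics step: nearest-rank percentiles
// on a sorted sample array, and a loop-based min/max scan that avoids
// spreading 500K+ values into Math.min/Math.max (stack overflow).
function percentile(sorted: number[], p: number): number {
  // Nearest-rank method: index of the p-th percentile sample.
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, idx)];
}

function minMax(samples: number[]): { min: number; max: number } {
  // A plain loop stays O(1) in stack depth regardless of sample count.
  let min = Infinity;
  let max = -Infinity;
  for (const s of samples) {
    if (s < min) min = s;
    if (s > max) max = s;
  }
  return { min, max };
}

const samples = [3.2, 18.3, 22.1, 45.7, 98.4, 152.9];
const sorted = [...samples].sort((a, b) => a - b);
console.log(percentile(sorted, 50)); // → 22.1
console.log(minMax(samples));        // → { min: 3.2, max: 152.9 }
```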

Architecture

src/
  types.ts        # Core types (BenchResult, CompareResult, ScenarioStep, etc.)
  errors.ts       # ReqBenchError, ValidationError with descriptive hints
  bench.ts        # Single endpoint benchmark engine
  compare.ts      # A/B comparison with Welch's t-test
  scenario.ts     # Multi-step YAML scenario runner
  reporter.ts     # Output formatters (terminal, json, markdown)
  profile.ts      # Profile save/load/delete with path traversal protection
  validation.ts   # URL, method, header, profile name validators
  stats.ts        # Statistics utilities (percentiles, stdev, t-test)
  cli.ts          # CLI entry point
  index.ts        # Public re-exports

Security

  • URL validation -- rejects non-HTTP protocols (ftp://, file://, javascript:), embedded credentials, and URLs over 2048 characters
  • Header injection prevention -- rejects headers containing CRLF (\r\n)
  • Method whitelisting -- only GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS are accepted
  • Profile path traversal -- profile names containing ../ or special characters are rejected before any filesystem operation
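The header-injection check, for example, amounts to refusing any CR or LF before the header reaches the HTTP client. A sketch (hypothetical function name, not the actual validator):

```typescript
// Hypothetical sketch of header validation: a "Key: Value" header is
// rejected if it contains CR or LF, which would otherwise let an
// attacker smuggle extra header lines into the raw request.
function parseHeader(raw: string): { key: string; value: string } {
  if (/[\r\n]/.test(raw)) {
    throw new Error(`Header contains CRLF: ${JSON.stringify(raw)}`);
  }
  const idx = raw.indexOf(":");
  if (idx <= 0) throw new Error(`Header must be "Key: Value": ${raw}`);
  return { key: raw.slice(0, idx).trim(), value: raw.slice(idx + 1).trim() };
}

console.log(parseHeader("Authorization: Bearer TOKEN"));
// → { key: "Authorization", value: "Bearer TOKEN" }
```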

FAQ

Q: How does the statistical comparison work? A: reqbench uses Welch's t-test (two-tailed) to compare the latency distributions of both endpoints. A p-value below 0.05 indicates a statistically significant difference. Welch's variant (rather than Student's) accounts for unequal sample sizes and variances. See docs/advanced-guide.md for the full formula and interpretation guide.
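As a rough illustration of that test (not the reqbench source; names are invented, and the p-value lookup from the t distribution is omitted), the t statistic and Welch-Satterthwaite degrees of freedom come out of the sample means and variances:

```typescript
// Hypothetical sketch of Welch's t-test for two latency sample sets.
// Returns the t statistic and Welch-Satterthwaite degrees of freedom;
// a full implementation would then derive the p-value from the t
// distribution with df degrees of freedom.
function welch(a: number[], b: number[]): { t: number; df: number } {
  const mean = (xs: number[]) => xs.reduce((s, x) => s + x, 0) / xs.length;
  const variance = (xs: number[], m: number) =>
    xs.reduce((s, x) => s + (x - m) ** 2, 0) / (xs.length - 1); // sample variance
  const ma = mean(a), mb = mean(b);
  const va = variance(a, ma) / a.length; // variance of the mean, group A
  const vb = variance(b, mb) / b.length; // variance of the mean, group B
  const t = (ma - mb) / Math.sqrt(va + vb);
  // Welch-Satterthwaite approximation for degrees of freedom
  const df =
    (va + vb) ** 2 /
    (va ** 2 / (a.length - 1) + vb ** 2 / (b.length - 1));
  return { t, df };
}
```

Unlike Student's t-test, nothing here assumes the two groups share a variance or a sample count, which is why it suits two benchmark runs that may differ in both.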

Q: What does "Winner: tie" mean? A: No statistically significant difference was detected. The observed latency gap could be due to random variation. Run with -d 60 or -c 50 to collect more samples for a more reliable result.

Q: Can I benchmark HTTPS endpoints? A: Yes. reqbench automatically detects the protocol from the URL and uses the appropriate Node.js http or https module.

Q: Does the warm-up phase affect results? A: No. The warm-up phase fires requests but discards all latency samples. Only measurements after the warm-up window are included in statistics.

Q: Why does RPS drop with very high concurrency? A: At high concurrency the server becomes the bottleneck, not the client. The measured RPS accurately reflects the server's throughput limit -- this is expected behavior, not a bug in reqbench.

Q: Can I run reqbench against localhost? A: Yes. reqbench run http://localhost:3000/health works normally. Use a warm-up period to allow your local server to reach steady state before measuring.


Troubleshooting

| Problem | Likely Cause | Solution |
|---------|--------------|----------|
| Request timeout | Endpoint too slow for default timeout | Increase with -t 10000 (10s) |
| ECONNREFUSED | Server not running or wrong port | Verify the URL and that the server is accepting connections |
| ENOTFOUND | DNS resolution failed | Check the hostname; verify network connectivity |
| Error Rate: 100% | All requests failing | Check URL, method, headers, and server logs |
| 0 requests in short tests | Duration too short for slow endpoints | Increase -d (e.g., -d 30) |
| Low RPS with high concurrency | Server is the bottleneck | This is accurate -- check server resources |
| DEPTH_ZERO_SELF_SIGNED_CERT | Self-signed SSL certificate | Set NODE_TLS_REJECT_UNAUTHORIZED=0 (dev only) |
| Profile load error | Corrupt JSON in profile file | Delete with reqbench profile delete <name> and re-save |


Documentation

  • Advanced Guide -- Welch's t-test formula, interpreting p-values, sample size recommendations, percentile meanings
  • Integration Patterns -- GitHub Actions, Danger.js, custom CI scripts, JSON pipeline examples

License

MIT