badgr-cli
v1.0.19
Badgr: run or serve GPU workloads from one command
Badgr runs GPU workloads through two commands: badgr serve for persistent endpoints and badgr run for one-off jobs.
npm install -g badgr-cli
Quick start
# 1. Authenticate once
badgr login
# 2. Serve an OpenAI-compatible inference endpoint
badgr serve meta-llama/Llama-3.1-8B-Instruct --gpu L40S
# 3. Use the endpoint with any OpenAI SDK client
# client = OpenAI(api_key="sk-...", base_url="https://dep-a1b2c3.api.badgr.ai/v1")
# 4. View cost, route, and retry receipts
badgr receipts
# 5. Stop billing
badgr down <deployment-id>
Commands
| Command | What it does |
|---------|-------------|
| badgr login | Save API key to ~/.badgr/config.json |
| badgr serve <model> | Start a persistent OpenAI-compatible endpoint |
| badgr run <command> | Run a one-off GPU job (any container command) |
| badgr down <id> | Terminate a deployment — stops billing immediately |
| badgr logs <id> | Fetch log output from a deployment |
| badgr receipts [n] | Cost, route, and retry receipts (default 10) |
badgr serve is for anything that needs a persistent endpoint: LLM serving, embeddings, image generation APIs, transcription APIs.
badgr run is for anything that starts, runs, and exits: batch inference, fine-tuning, evals, image/video batch jobs, audio processing.
badgr serve options
badgr serve meta-llama/Llama-3.1-8B-Instruct --gpu L40S --region EU
| Flag | Default | Description |
|------|---------|-------------|
| --gpu <type> | RTX_4090 | GPU: RTX_4090, L40S, A6000, A100, H100 |
| --region <region> | — | Optional region preference. If omitted, Badgr chooses best available capacity. |
| --count <n> | 1 | Number of GPUs (1–8) |
| --max-price <$/hr> | — | Hard spend cap per GPU-hour |
| --dry-run | — | Preview routing without provisioning |
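When badgr serve is launched from a script rather than typed by hand, it can help to check flag values before shelling out. The sketch below builds an argument list using only the flags from the table above. The helper name build_serve_args, its validation rules, and the plain-number format for --max-price are illustrative assumptions, not part of badgr-cli; the CLI may apply its own checks.

```python
from typing import List, Optional

# GPU flag values as listed in the options table.
GPU_TYPES = {"RTX_4090", "L40S", "A6000", "A100", "H100"}


def build_serve_args(
    model: str,
    gpu: str = "RTX_4090",
    region: Optional[str] = None,
    count: int = 1,
    max_price: Optional[float] = None,
    dry_run: bool = False,
) -> List[str]:
    """Build an argv list for `badgr serve` (hypothetical helper)."""
    if gpu not in GPU_TYPES:
        raise ValueError(f"unknown GPU type: {gpu}")
    if not 1 <= count <= 8:
        raise ValueError("count must be between 1 and 8")

    args = ["badgr", "serve", model, "--gpu", gpu]
    if region is not None:
        args += ["--region", region]
    if count != 1:
        args += ["--count", str(count)]
    if max_price is not None:
        # Assumption: the flag accepts a plain decimal number of dollars.
        args += ["--max-price", f"{max_price:.2f}"]
    if dry_run:
        args.append("--dry-run")
    return args


if __name__ == "__main__":
    # Pass the list to subprocess.run(args, check=True) to invoke the CLI.
    print(build_serve_args("meta-llama/Llama-3.1-8B-Instruct", gpu="L40S", region="EU"))
```

Building a list instead of a single string avoids shell quoting problems when a model name or region contains unusual characters.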
badgr run options
badgr run python train.py --gpu A100 --env HF_TOKEN=$HF_TOKEN
| Flag | Default | Description |
|------|---------|-------------|
| --gpu <type> | RTX_4090 | GPU type |
| --image <img> | python:3.11-slim | Docker image |
| --env KEY=VALUE | — | Environment variable (repeatable) |
| --region <region> | — | Optional region preference. If omitted, Badgr chooses best available capacity. |
| --max-price <$/hr> | — | Hard spend cap per GPU-hour |
| --detach | — | Launch and return immediately |
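In the example above, the shell expands $HF_TOKEN before badgr sees it. When a job is launched from Python without a shell, that expansion has to happen explicitly. A minimal sketch, assuming the flags follow the job command as in the README example; env_flags and build_run_args are hypothetical helper names, not part of badgr-cli.

```python
import os
from typing import List, Mapping, Sequence


def env_flags(names: Sequence[str], environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return repeated `--env KEY=VALUE` arguments for the given variable names."""
    missing = [name for name in names if name not in environ]
    if missing:
        # Fail early rather than launch a job that is missing a secret.
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    flags: List[str] = []
    for name in names:
        flags += ["--env", f"{name}={environ[name]}"]
    return flags


def build_run_args(
    command: Sequence[str],
    gpu: str,
    env_names: Sequence[str] = (),
    environ: Mapping[str, str] = os.environ,
    detach: bool = False,
) -> List[str]:
    """Build an argv list for `badgr run` (hypothetical helper)."""
    args = ["badgr", "run", *command, "--gpu", gpu]
    args += env_flags(env_names, environ)
    if detach:
        args.append("--detach")
    return args


if __name__ == "__main__":
    # A stand-in mapping is used here so the demo does not print real secrets.
    demo_env = {"HF_TOKEN": "hf_example"}
    print(build_run_args(["python", "train.py"], "A100", ["HF_TOKEN"], demo_env))
```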
Routing
Badgr automatically searches available GPU capacity across its verified compute network and chooses the best eligible route for your GPU type, workload, price cap, and optional region preference.
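Badgr's actual routing logic is not documented here. As a mental model only, the sketch below filters hypothetical capacity offers by GPU type and price cap, treats the region as a soft preference, and picks the cheapest remaining offer. The Offer fields, the choose_route name, and the "cheapest wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Offer:
    """A hypothetical capacity offer; field names are illustrative."""
    provider: str
    gpu: str
    region: str
    price_per_hour: float


def choose_route(
    offers: Iterable[Offer],
    gpu: str,
    max_price: Optional[float] = None,
    region: Optional[str] = None,
) -> Optional[Offer]:
    """Pick the cheapest eligible offer, or None if nothing qualifies."""
    eligible = [
        o for o in offers
        if o.gpu == gpu and (max_price is None or o.price_per_hour <= max_price)
    ]
    if region is not None:
        # Assumption: region is a preference, so fall back if it has no capacity.
        preferred = [o for o in eligible if o.region == region]
        if preferred:
            eligible = preferred
    return min(eligible, key=lambda o: o.price_per_hour, default=None)


if __name__ == "__main__":
    offers = [
        Offer("a", "L40S", "US", 1.15),
        Offer("b", "L40S", "EU", 1.30),
        Offer("c", "L40S", "EU", 1.45),
    ]
    print(choose_route(offers, "L40S", max_price=1.40, region="EU"))
```

A None result corresponds to the case where no capacity fits the price cap, which is what a --dry-run preview would be expected to surface before anything is provisioned.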
Preview before provisioning:
badgr serve mistral-7b --gpu RTX_4090 --dry-run
Receipts
Every badgr serve and badgr run action generates a receipt:
badgr receipts # last 10
badgr receipts 50 # last 50
Each receipt includes: receipt ID, GPU type, provisioning latency, price/hr, retry count, and status.
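The output format of badgr receipts is not specified here, so the following is only a sketch of how the six listed fields could be modelled and summarized once they have been parsed. The Receipt class, its field names and units, and the summarize helper are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Receipt:
    """One receipt with the fields listed above (names and units assumed)."""
    receipt_id: str
    gpu_type: str
    provisioning_latency_s: float
    price_per_hour: float
    retry_count: int
    status: str


def summarize(receipts: List[Receipt]) -> Dict[str, object]:
    """Aggregate latency, retries, and status counts across receipts."""
    if not receipts:
        return {"count": 0}
    return {
        "count": len(receipts),
        "mean_latency_s": mean(r.provisioning_latency_s for r in receipts),
        "total_retries": sum(r.retry_count for r in receipts),
        # Status values are counted as-is; no particular set of values is assumed.
        "by_status": dict(Counter(r.status for r in receipts)),
    }
```

A rising retry total or mean latency for one GPU type is the kind of signal that might justify trying a different --gpu or --region.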
OpenAI compatibility
badgr serve provisions a vLLM endpoint that is fully OpenAI-compatible:
from openai import OpenAI

client = OpenAI(
    api_key="your-badgr-api-key",
    base_url="https://dep-a1b2c3.api.badgr.ai/v1",  # from badgr serve output
)
resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
)

import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.BADGR_API_KEY,
  baseURL: "https://dep-a1b2c3.api.badgr.ai/v1",
});
GPU options
| Flag value | GPU | VRAM | Estimated Badgr price/hr |
|-----------|-----|------|-------------|
| RTX_4090 | NVIDIA RTX 4090 | 24 GB | $0.65–0.89 |
| L40S | NVIDIA L40S | 48 GB | $1.10–1.40 |
| A6000 | NVIDIA RTX A6000 | 48 GB | $1.05–1.35 |
| A100 | NVIDIA A100 | 80 GB | $1.20–1.50 |
| H100 | NVIDIA H100 | 80 GB | $2.80–3.10 |
Prices shown are estimated Badgr rates. Final $/GPU-hour is confirmed before provisioning and may vary by GPU type, availability, region, workload, and runtime.
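For rough budgeting, the estimated ranges can be multiplied out by GPU count and runtime. The sketch below copies the ranges from the table and assumes cost scales linearly with GPU count and hours, since prices are quoted per GPU-hour. The estimate_cost helper is hypothetical, and actual billing granularity may differ.

```python
from decimal import Decimal
from typing import Dict, Tuple

# Estimated $/GPU-hour ranges copied from the GPU options table.
PRICE_RANGES: Dict[str, Tuple[Decimal, Decimal]] = {
    "RTX_4090": (Decimal("0.65"), Decimal("0.89")),
    "L40S": (Decimal("1.10"), Decimal("1.40")),
    "A6000": (Decimal("1.05"), Decimal("1.35")),
    "A100": (Decimal("1.20"), Decimal("1.50")),
    "H100": (Decimal("2.80"), Decimal("3.10")),
}


def estimate_cost(gpu: str, count: int, hours: Decimal) -> Tuple[Decimal, Decimal]:
    """Return the (low, high) estimated cost in dollars for a workload."""
    if gpu not in PRICE_RANGES:
        raise ValueError(f"unknown GPU type: {gpu}")
    low, high = PRICE_RANGES[gpu]
    # Assumption: cost = price per GPU-hour x number of GPUs x hours.
    return (low * count * hours, high * count * hours)


if __name__ == "__main__":
    low, high = estimate_cost("L40S", count=2, hours=Decimal("3"))
    print(f"Estimated cost: ${low:.2f} to ${high:.2f}")
```

Under these assumptions, two L40S GPUs for three hours come to between $6.60 and $8.40. Decimal is used instead of float so that small currency amounts do not pick up rounding noise.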
Requirements
- Node.js 18+
- A Badgr account — sign up at badgr.ai
