badgr-cli

v1.0.46

Published

2 days ago

Badgr — run or serve GPU workloads from one command

michaelmanley

gpu cli ai compute gateway openai workload workspace llm inference fine-tuning

badgr-cli

Badgr supports many GPU workloads through two commands: serve for persistent endpoints, run for jobs.

Safety promise: every run has --max-cost, live logs, automatic teardown, and a receipt. Run badgr down <id> any time to stop billing immediately.

npm install -g badgr-cli

Quick start

# 1. Authenticate once
badgr login

# 2. Verify the stack end-to-end
badgr test

# 3. Run a project folder on a GPU
badgr run . --cmd "python train.py" --max-cost 5

# 4. Serve an OpenAI-compatible inference endpoint
badgr serve meta-llama/Llama-3.1-8B-Instruct --max-cost 10

# 5. Stop billing
badgr down <deployment-id>

# 6. View cost, route, and retry receipts
badgr receipts

badgr serve prints a URL you can point any OpenAI SDK client at — see OpenAI compatibility.

Also try: image generation

badgr comfyui batch --workflow sdxl-basic --prompt "a cat on a beach" --max-cost 5

Runs a blessed ComfyUI workflow with no setup and prints image URLs when done. The job stops itself, so no manual teardown is needed. Details in badgr comfyui batch options.

Commands

login
run
serve
status
logs
down
receipts
test

| Command | What it does |
|---------|-------------|
| badgr login | Save API key to ~/.badgr/config.json |
| badgr run <command> | Run a one-off GPU job (any container command) |
| badgr serve <model> | Start a persistent OpenAI-compatible endpoint |
| badgr status | Show what's running and what's billing |
| badgr logs <id> | Fetch log output from a deployment |
| badgr down <id> | Terminate a deployment and stop billing immediately |
| badgr receipts [n] | Cost, route, and retry receipts (default 10) |
| badgr test | Run an end-to-end test (provision → run → teardown) |

More commands below, under Advanced: comfyui, train, transcribe, embed, workload, workspace, capacity, billing.

badgr serve — for anything that needs a persistent endpoint: LLM serving, embeddings, image generation APIs, transcription APIs.

badgr run — for anything that starts, runs, and exits: batch inference, fine-tuning, evals, image/video batch jobs, audio processing.

Common flags

These appear on most commands (run, serve, comfyui, train, transcribe, embed) — documented once here instead of repeated in every table below.

| Flag | Default | Description |
|------|---------|-------------|
| --gpu <type> | auto | GPU type override (see GPU options) |
| --tier 1\|2 | 1 | 1 = reliable managed routing (default); 2 = lower-cost marketplace routing |
| --region US\|EU\|AU | - | Region preference. If omitted, Badgr chooses best available capacity |
| --max-price <$/hr> | - | Hard spend cap per GPU-hour |
| --count <n> | 1 | Number of GPUs |
| --env KEY=VALUE | - | Environment variable (repeatable) |
| --max-cost <$> | - | Auto-stop when total spend reaches this amount. Required for run, comfyui batch, and train lora; comfyui run requires it unless --persistent is set |
| --detach | off | Launch and return immediately, don't stream logs (serve uses --no-wait instead; see below) |

Each command section below lists only its own extra flags.
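The two spend caps can be combined to reason about how long a job is guaranteed to run before --max-cost stops it. The sketch below assumes billing is linear in GPU-hours (rate × GPUs × hours), which this README does not state explicitly; the function name and the example prices are hypothetical.

```python
def min_hours_before_cap(max_cost: float,
                         max_price_per_gpu_hour: float,
                         count: int = 1) -> float:
    """Lower bound on runtime (hours) before --max-cost is reached.

    Assumption: spend accrues as price * count * hours, and the actual
    hourly price never exceeds --max-price.
    """
    if max_cost <= 0 or max_price_per_gpu_hour <= 0 or count < 1:
        raise ValueError("caps must be positive and count must be >= 1")
    return max_cost / (max_price_per_gpu_hour * count)


# Hypothetical numbers: --max-cost 5 --max-price 0.50 --count 1
hours = min_hours_before_cap(5, 0.50)  # 10.0 hours at most-expensive pricing
```

If the routed GPU is cheaper than --max-price, the job can run longer than this bound before the cost cap applies.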

`badgr serve` options

badgr serve meta-llama/Llama-3.1-8B-Instruct --gpu L40S --region EU

| Flag | Default | Description |
|------|---------|-------------|
| --image <img> | - | Serve a custom container instead of a HuggingFace model |
| --task <task> | - | vLLM task override, e.g. embed for embedding models |
| --health-path <path> | auto | Readiness path to poll (auto-detected for ComfyUI → /system_stats) |
| --no-wait | off | Skip endpoint health check and return immediately |
| --list-aliases | - | List blessed vLLM model aliases (qwen-7b, llama-8b, qwen-coder-7b) and exit; no provisioning, no API key required |

Blessed aliases expand to a full model ID + preset GPU, e.g. badgr serve qwen-7b → Qwen/Qwen2.5-7B-Instruct on an RTX 4090. Run badgr serve --list-aliases to see the current list.

Model support levels

badgr serve qwen-7b is the happy path — a tested route with no extra setup. badgr serve also accepts any other model ID or a custom container:

| Level | What it means |
|-------|---------------|
| Tested route (badgr serve qwen-7b) | One of the aliases above: tested and officially supported. |
| Best-effort Hugging Face model (badgr serve <org>/<model>) | Any other Hugging Face model ID. Badgr will try a compatible route, with no guarantee that every model works. |
| Custom container (badgr serve --image ...) | You own the server behavior; Badgr manages runtime, logs, spend caps, teardown, and the receipt. |

Gated Hugging Face models (e.g. Llama, Gemma) may need --env HF_TOKEN=$HF_TOKEN. Badgr only prints that hint if the deployment actually fails to start.

`badgr run` options

Three source patterns:

# Flow 1 — local project folder (primary)
badgr run . --cmd "python train.py" --max-cost 5

# Flow 2 — public GitHub repo
badgr run https://github.com/user/repo --cmd "python train.py" --max-cost 5

# Flow 3 — custom Docker image (advanced)
badgr run . --image mycompany/custom:latest --cmd "python train.py" --max-cost 5

Badgr zips and uploads the folder (Flow 1) or clones the repo (Flow 2), picks a generic runner, installs deps, runs the command, stores outputs for 48 hours, and tears down the GPU. --max-cost is required.

| Flag | Default | Description |
|------|---------|-------------|
| --cmd <command> | - | Command to run inside the uploaded project or cloned repo (required for folder/GitHub flows) |
| --min-vram <GB> | - | Minimum VRAM in GB; optional constraint for Auto routing |
| --image <img> | - | Custom Docker image; bypasses the runner |
| --max-runtime <min> | - | Auto-stop after N minutes |
| --save <name> | - | Save this job as a named workload after it completes |
| --workspace <name-or-id> | - | Link this run to a workspace (name or ws_ ID) |
| --output <path> | - | See "Crash recovery" below |
| --checkpoint <path> | - | See "Crash recovery" below |
| --retry-safe | off | See "Crash recovery" below |
| --resume-cmd "<cmd>" | - | See "Crash recovery" below |
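When badgr run is launched from a script or CI job, it can help to assemble the argument list in one place so that --max-cost is never forgotten. This is a minimal sketch using only flags documented in this README; the helper name build_run_argv is hypothetical and not part of badgr-cli.

```python
import shlex
from typing import Dict, List, Optional


def build_run_argv(source: str,
                   cmd: str,
                   max_cost: float,
                   gpu: Optional[str] = None,
                   max_runtime_min: Optional[int] = None,
                   env: Optional[Dict[str, str]] = None) -> List[str]:
    """Build an argv list for `badgr run`, refusing to omit the cost cap."""
    if max_cost <= 0:
        raise ValueError("--max-cost is required and must be positive")
    argv = ["badgr", "run", source, "--cmd", cmd, "--max-cost", f"{max_cost:g}"]
    if gpu:
        argv += ["--gpu", gpu]
    if max_runtime_min is not None:
        argv += ["--max-runtime", str(max_runtime_min)]
    for key, value in (env or {}).items():
        argv += ["--env", f"{key}={value}"]  # --env is repeatable
    return argv


argv = build_run_argv(".", "python train.py", 5)
printable = shlex.join(argv)  # shell-quoted form, handy for logging
# To execute: subprocess.run(argv, check=True)
```

Passing the list to subprocess.run without a shell avoids quoting problems in the --cmd string.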

Crash recovery is a convention, not a feature Badgr runs for you. Badgr just passes these through into the container as environment variables — your script has to read them and actually write the files:

| Flag | Container sees | Your script needs to |
|------|-----------------|-----------------------|
| --output <path> | BADGR_OUTPUT_DIR=<path> | Write outputs it wants preserved to that path |
| --checkpoint <path> | BADGR_CHECKPOINT_DIR=<path> | Write/read resumable checkpoints at that path |
| --retry-safe | BADGR_RETRY_SAFE=1 | Confirm it's safe to re-run from scratch (e.g. it checkpoints/dedupes internally) |
| --resume-cmd "<cmd>" | (not passed to the container) | Nothing. Badgr just prints this command back to you on failure, so you have it without digging through shell history |
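Because the script owns the recovery logic, a job typically reads these variables at startup and resumes from whatever checkpoint it finds. The following is a minimal sketch of that convention; the file names state.json and result.json, the local fallback directories, and the run_steps function are assumptions for illustration, not anything Badgr prescribes.

```python
import json
import os
from pathlib import Path


def run_steps(total_steps: int) -> int:
    """Run a resumable loop; returns the step it started from."""
    # Fall back to local folders when the Badgr variables are not set.
    ckpt_dir = Path(os.environ.get("BADGR_CHECKPOINT_DIR", "./checkpoints"))
    out_dir = Path(os.environ.get("BADGR_OUTPUT_DIR", "./outputs"))
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    out_dir.mkdir(parents=True, exist_ok=True)
    state_file = ckpt_dir / "state.json"

    # Resume from the last completed step if a checkpoint exists.
    start = 0
    if state_file.exists():
        start = json.loads(state_file.read_text())["next_step"]

    for step in range(start, total_steps):
        # ... one unit of real work would go here ...
        tmp = state_file.with_suffix(".tmp")
        tmp.write_text(json.dumps({"next_step": step + 1}))
        os.replace(tmp, state_file)  # atomic swap avoids torn checkpoints

    # Anything that should be preserved goes under the output directory.
    (out_dir / "result.json").write_text(json.dumps({"steps_done": total_steps}))
    return start
```

Writing the checkpoint to a temporary file and renaming it means a crash mid-write leaves the previous checkpoint intact, which is what makes re-running the same command safe.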

On failure, badgr run / badgr serve / badgr comfyui run all print a Class:/Next: pair identifying what went wrong (e.g. out_of_memory, image_pull_failed, cuda_unavailable) and a suggested next step.

`badgr comfyui run` options

badgr comfyui run workflow.json --max-cost 10
badgr comfyui run workflow.json --gpu RTX_4090 --check-nodes KSampler,CLIPTextEncode

Requires either --max-cost or --persistent to prevent runaway billing.

| Flag | Default | Description |
|------|---------|-------------|
| --check-nodes <n1,n2> | - | Verify custom nodes are installed after startup |
| --no-wait | off | Skip health check, return immediately |
| --persistent | off | Run until manually stopped (no spending cap) |
| --yes / -y | off | Skip duplicate-deployment warning |

`badgr comfyui batch` options

Productized batch image generation — runs a list of prompts through a blessed ComfyUI workflow and returns image URLs. No ComfyUI setup, no workflow file, no manual teardown.

badgr comfyui batch --workflow sdxl-basic --prompts prompts.txt --max-cost 10
badgr comfyui batch --workflow sdxl-basic --prompt "a cat on a beach" --prompt "a dog in the park" --max-cost 5

Blessed workflows: sdxl-basic (SDXL text-to-image, default sampler settings). Max 20 prompts per batch.

| Flag | Default | Description |
|------|---------|-------------|
| --workflow <name> | - | Blessed workflow ID (required); currently sdxl-basic |
| --prompts <file> | - | Text file, one prompt per line |
| --prompt <text> | - | Inline prompt (repeatable); combine with --prompts if needed |
| --max-runtime <min> | 60 | Auto-stop after N minutes |
| --gpu-type <type> | workflow default | GPU type override |
| --dry-run | - | Preview the batch (workflow, GPU, prompt count, cost) without provisioning |

Polls until complete and prints image URLs, or detaches with badgr status guidance if it outlives --max-runtime.
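Since a batch accepts at most 20 prompts, a longer prompt list has to be split before submission. A minimal sketch of that split follows; skipping blank lines is an assumption about how the prompts file is parsed, and chunk_prompts is a hypothetical helper, not part of badgr-cli.

```python
from typing import Iterable, List

MAX_PROMPTS_PER_BATCH = 20  # limit stated in this README


def chunk_prompts(lines: Iterable[str],
                  size: int = MAX_PROMPTS_PER_BATCH) -> List[List[str]]:
    """Split one-prompt-per-line text into batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be >= 1")
    prompts = [line.strip() for line in lines if line.strip()]
    return [prompts[i:i + size] for i in range(0, len(prompts), size)]


# 45 prompts become batches of 20, 20, and 5.
batches = chunk_prompts(f"prompt {n}" for n in range(45))
```

Each batch can then be written to its own file and passed to --prompts in a separate badgr comfyui batch invocation.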

Receipts

Every badgr serve and badgr run action generates a receipt:

badgr receipts        # last 10
badgr receipts 50     # last 50

Each receipt includes runtime, estimated/settled cost, status, retries, teardown/billing result, and job/deployment ID.

OpenAI compatibility

badgr serve provisions a vLLM endpoint that is fully OpenAI-compatible:

import os
from openai import OpenAI

# Export BADGR_ENDPOINT from the URL printed by `badgr serve`
client = OpenAI(
    api_key=os.environ["BADGR_API_KEY"],
    base_url=os.environ["BADGR_ENDPOINT"],
)
resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello"}],
)

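// Node.js (ES module)
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.BADGR_API_KEY,
  baseURL: process.env.BADGR_ENDPOINT,  // URL printed by `badgr serve`
});
const resp = await client.chat.completions.create({
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [{ role: "user", content: "Hello" }],
});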
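For environments without an OpenAI SDK, the same call can be made as a plain HTTP request. This sketch uses only the Python standard library and mirrors what the SDK sends: a POST to the chat/completions path under the base URL with a Bearer token. It assumes BADGR_ENDPOINT already includes any path prefix the endpoint expects (the README does not say whether the printed URL ends in /v1); build_chat_request is a hypothetical helper.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str,
                       model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To send it (requires a live endpoint):
#   import os
#   req = build_chat_request(os.environ["BADGR_ENDPOINT"],
#                            os.environ["BADGR_API_KEY"],
#                            "meta-llama/Llama-3.1-8B-Instruct", "Hello")
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)["choices"][0]["message"]["content"]
```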

GPU options

Badgr Auto selects the best eligible GPU for your workload. Add --gpu <type> or --min-vram <GB> only when you need more control.

| Flag value | GPU | VRAM | Best for |
|-----------|-----|------|---------|
| RTX_3090 | NVIDIA RTX 3090 | 24 GB | Dev, inference |
| RTX_4090 | NVIDIA RTX 4090 | 24 GB | Inference, training, dev |
| L40S | NVIDIA L40S | 48 GB | Inference, vLLM, embeddings |
| A100 | NVIDIA A100 | 40–80 GB | Training, inference |
| H100 | NVIDIA H100 | 80 GB | Large model training |

Additional GPU types may be routable depending on current capacity — check with badgr capacity. Pricing is confirmed before provisioning; use --dry-run to see it first. Full GPU support details: see NOTES.md in the repo root.
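When choosing a --min-vram value by hand, a common rule of thumb is that model weights need roughly (parameter count × bytes per parameter), plus headroom for activations and KV cache. The sketch below applies that rule to the table above. The 20% overhead factor is an assumption, the A100 entry uses the lower bound of its 40–80 GB range, and none of this reflects how Badgr Auto actually routes.

```python
# VRAM per GPU type, taken from the table in this README.
GPU_VRAM_GB = {
    "RTX_3090": 24,
    "RTX_4090": 24,
    "L40S": 48,
    "A100": 40,  # listed as 40-80 GB; lower bound used to stay conservative
    "H100": 80,
}


def estimate_min_vram_gb(params_billions: float,
                         bytes_per_param: int = 2,
                         overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: weights (fp16 = 2 bytes/param) plus headroom."""
    return params_billions * bytes_per_param * overhead


def gpus_with_enough_vram(min_vram_gb: float) -> list:
    """Flag values from the table whose VRAM meets the requirement."""
    return [name for name, vram in GPU_VRAM_GB.items() if vram >= min_vram_gb]


need = estimate_min_vram_gb(8)            # an 8B model in fp16: about 19.2 GB
candidates = gpus_with_enough_vram(need)  # every GPU in the table qualifies
```

Long context windows or large batch sizes can push real usage well above this estimate, so treat the result as a starting point for --min-vram rather than a guarantee.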

Routing

--tier 1 (default) uses managed provider routing — reliable, consistent performance. --tier 2 uses marketplace routing for lower-cost options. Most users should stick with the default.

Preview any command before provisioning with --dry-run, e.g.:

badgr serve meta-llama/Llama-3.1-8B-Instruct --dry-run

Advanced

Less common commands — training, transcription, embeddings, and the workload/workspace trackers. Same flags as run/serve unless noted (see Common flags).

`badgr train` / `badgr train lora`

badgr train config.yaml --gpu A100 --max-runtime 240 --env HF_TOKEN=$HF_TOKEN

Detects framework (axolotl, unsloth, trl) from the config file, but only Axolotl configs run today — unsloth/trl/unrecognized configs are blocked before provisioning rather than billing a GPU that's guaranteed to fail. Default max-runtime is 120 min.

badgr train lora --base-model mistralai/Mistral-7B-v0.1 --dataset ./train.jsonl --preset small --max-cost 20

Productized LoRA training — pass a base model and dataset, no Axolotl config file needed. Badgr generates the config from a preset and returns a downloadable adapter.

| Flag | Default | Description |
|------|---------|-------------|
| --framework <name> | auto-detect | Force framework: axolotl, unsloth, trl (only axolotl currently runs) |
| --base-model <id> | - | HuggingFace model ID (required for train lora); validated to exist before provisioning |
| --dataset <path\|url> | - | Local file, direct URL, or s3:// URI |
| --file-id <id> | - | Badgr upload ID instead of --dataset |
| --preset small\|medium | small | small = RTX 4090, rank 16, 3 epochs. medium = A100, rank 32, 5 epochs |
| --gpu-type <type> | preset default | GPU type override for train lora |
| --dry-run | - | Preview the job without provisioning |

On completion, train lora prints an adapter_url — download with GET /v1/jobs/{job_id}/adapter, or via badgr workload info if saved.

`badgr transcribe`

badgr transcribe recording.mp3 --max-cost 2

Whisper transcription. Accepts a public URL, S3/GCS URI, or a local file under 50 MB. Default max-runtime is 30 min.

| Flag | Default | Description |
|------|---------|-------------|
| --model <name> | large-v3 | Whisper model |
| --language <code> | - | Language hint (e.g. en, fr) |
| --output <format> | - | Output format: txt, srt, vtt |

`badgr embed`

badgr embed BAAI/bge-large-en-v1.5 documents.txt --max-cost 2

Text embeddings via vLLM. Accepts a public URL, S3/GCS URI, or a local text file under 10 MB. Outputs JSONL ({"text": ..., "embedding": [...]}). Default max-runtime is 30 min.

| Flag | Default | Description |
|------|---------|-------------|
| --batch-size <n> | - | Embedding batch size |
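The JSONL output can be consumed line by line. As a sketch of downstream use, the snippet below parses records in the documented {"text": ..., "embedding": [...]} shape and compares two vectors with cosine similarity; the function names are hypothetical and the sample records are made up for illustration.

```python
import json
import math
from typing import Iterable, List, Sequence, Tuple


def load_embeddings(lines: Iterable[str]) -> List[Tuple[str, List[float]]]:
    """Parse JSONL records into (text, embedding) pairs, skipping blank lines."""
    rows = []
    for line in lines:
        if line.strip():
            record = json.loads(line)
            rows.append((record["text"], record["embedding"]))
    return rows


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up two-dimensional vectors; real embeddings are much longer.
sample = ['{"text": "a", "embedding": [1, 0]}',
          '{"text": "b", "embedding": [0, 1]}']
rows = load_embeddings(sample)
similarity = cosine(rows[0][1], rows[1][1])  # orthogonal vectors give 0.0
```

With a real output file, pass the open file handle to load_embeddings instead of a list of strings.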

Workloads

A workload is a saved job configuration, created with --save <name> on badgr run. Rerun by name instead of retyping all the flags; Badgr tracks success rate, average cost, and the last known-good route.

badgr run . --cmd "python train.py" --max-cost 10 --save my-training-job
badgr workload run my-training-job
badgr workload list
badgr workload info my-training-job
badgr workload delete my-training-job

| Subcommand | Description |
|------------|-------------|
| list [n] | List saved workloads |
| info <name> | Stats, route history, recent jobs |
| run <name> | Submit a new job using the workload's saved config (--max-cost, --max-runtime, --set KEY=VALUE to override) |
| delete <name> | Delete the workload record |

Workspaces

Most users never need this — badgr run . handles upload, caching, and artifact storage automatically. Workspaces are for grouping jobs under a named cost/context bucket, optionally linked to S3/GCS storage.

badgr workspace create my-project --storage s3://my-bucket/runs --desc "nightly evals"
badgr run . --cmd "python eval.py" --workspace my-project --max-cost 5
badgr workspace info my-project
badgr workspace list
badgr workspace delete my-project

Other commands

badgr capacity [--gpu <type>] — check available GPU capacity right now
badgr billing status / badgr billing add <amount> — check balance / add funds

Requirements

Node.js 18+
A Badgr account — sign up at aibadgr.com
