@c0mpute/worker
v2.8.2
Native CLI worker for the c0mpute.ai distributed inference network. Runs LLM inference via ollama and connects to the orchestrator via Socket.io.
c0mpute-worker
Native CLI worker for the c0mpute.ai distributed inference network. Connects to the orchestrator over Socket.io and serves jobs from your GPU. A worker runs in one of two modes — text or image — chosen on first run:
- Max (text) — LLM inference via ollama. Pick a model: Qwen3.5 27B or SuperGemma4 26B (both uncensored).
- Image — text-to-image via ComfyUI + the uncensored Chroma1-HD model.
Quick Start
npx @c0mpute/worker --token <your-token>
The worker asks which mode to run (Max or Image) on every interactive start, defaulting to your last choice: press Enter to keep it, or pick the other to switch. Skip the prompt entirely with --mode:
npx @c0mpute/worker --token <your-token> --mode max # text worker
npx @c0mpute/worker --token <your-token> --mode image # image worker
For a Max worker it then asks which model to run (again on every interactive start, defaulting to your last choice). It shows how many workers are live on each model and recommends the one with the fewest, so new supply balances the network. Skip that prompt with --model:
npx @c0mpute/worker --token <your-token> --mode max --model qwen # Qwen3.5 27B
npx @c0mpute/worker --token <your-token> --mode max --model supergemma # SuperGemma4 26B
Get a token at c0mpute.ai/earn. Only the chosen mode + model is downloaded, never more than one.
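The model recommendation described above (prefer the model with the fewest live workers) can be sketched roughly as follows. This is an illustration only: the `ModelSupply` shape, the `recommendModel` name, the tie-breaking rule, and the sample counts are assumptions, not the worker's actual code or data.

```typescript
// Hypothetical shape: live worker counts per model, as the prompt might show them.
type ModelId = "qwen" | "supergemma";

interface ModelSupply {
  model: ModelId;
  liveWorkers: number;
}

// Returns the model with the fewest live workers.
// Assumption: on a tie, the earlier entry in the list wins.
function recommendModel(supply: ModelSupply[]): ModelId {
  if (supply.length === 0) {
    throw new Error("no models to choose from");
  }
  let best = supply[0];
  for (const entry of supply.slice(1)) {
    if (entry.liveWorkers < best.liveWorkers) {
      best = entry;
    }
  }
  return best.model;
}

// Example with made-up counts.
const recommended = recommendModel([
  { model: "qwen", liveWorkers: 12 },
  { model: "supergemma", liveWorkers: 7 },
]);
console.log(recommended);
```

Steering new workers toward the least-served model is a simple way to even out supply without any central assignment.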
Max (text) worker
Runs your chosen model via ollama: Qwen3.5 27B (tools, vision, thinking) or SuperGemma4 26B (MoE, newer, faster, tools; text only). On first run it:
- Installs ollama if it's missing (winget on Windows, Homebrew on macOS, the official script on Linux).
- Starts and configures ollama (flash-attention + q8 KV cache on NVIDIA for ~36% more speed).
- Pulls the model (~17GB).
- Tunes a VRAM-adaptive context window (24GB → 32K, 48GB+ → 64K).
- Runs a speed benchmark.
- Serves jobs (streaming + tool calling, plus vision/thinking on models that support them).
Every interactive start asks for your model again, with the last one as the default; press Enter to keep it or pass --model to set it directly.
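The VRAM-adaptive context window can be pictured as a simple threshold lookup. The sketch below uses only the two documented tiers (24GB → 32K, 48GB+ → 64K). Everything else is an assumption: the function name, reading "K" as 1024 tokens, keeping 32K for sizes between 24GB and 48GB, and the 16K fallback below 24GB, which the README does not specify.

```typescript
// Assumption: "32K" and "64K" mean multiples of 1024 tokens.
const CONTEXT_32K = 32 * 1024;
const CONTEXT_64K = 64 * 1024;

// Assumption: the README gives no value below 24GB; 16K is a placeholder.
const ASSUMED_SMALL_CONTEXT = 16 * 1024;

// Maps available VRAM (in GB) to a context window size in tokens.
function pickContextWindow(vramGb: number): number {
  if (vramGb >= 48) return CONTEXT_64K; // documented: 48GB+ -> 64K
  if (vramGb >= 24) return CONTEXT_32K; // documented: 24GB -> 32K
  return ASSUMED_SMALL_CONTEXT;         // undocumented fallback
}

console.log(pickContextWindow(24));
console.log(pickContextWindow(48));
```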
Supervise ollama yourself? Set C0MPUTE_MANAGE_OLLAMA=0 to use your running instance.
Requirements: Node 18+, ollama, 20GB+ VRAM (RTX 3090/4090, Apple Silicon 32GB+), ~17GB disk.
Image worker
Runs the uncensored Chroma1-HD model on ComfyUI and renders the jobs the orchestrator dispatches. The worker is a thin relay: the orchestrator sends the full workflow (model + tuned defaults), so every worker produces identical output and the recipe can change without you updating anything.
On startup it:
- Checks ComfyUI is reachable (COMFY_URL, default http://127.0.0.1:8188) and starts it if COMFY_DIR is set.
- Downloads the Chroma model files (~14GB, first run only) if they're missing.
- Runs a render self-check — a quick 512×512 test image — and only registers if it succeeds, so a broken setup never accepts jobs.
- Serves render jobs and earns per image.
Requirements: Node 18+, ComfyUI (point COMFY_URL at it, or set COMFY_DIR so the worker can launch it), a 24GB GPU (RTX 3090/4090) recommended, ~14GB disk for the model.
Env: COMFY_URL (ComfyUI endpoint), COMFY_DIR (ComfyUI folder, lets the worker install/launch it + place models).
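As a rough illustration of how the two variables interact, the sketch below resolves them into a small config object: COMFY_URL falls back to the documented default, and launching is only possible when COMFY_DIR is set. The `ComfyConfig` type, the `resolveComfyConfig` name, the example folder path, and the choice to treat empty or whitespace-only values as unset are all assumptions, not the worker's actual implementation.

```typescript
interface ComfyConfig {
  url: string;             // ComfyUI endpoint to check
  dir: string | undefined; // ComfyUI folder, if one was provided
  canLaunch: boolean;      // true only when COMFY_DIR is set
}

// Default endpoint documented in the README.
const DEFAULT_COMFY_URL = "http://127.0.0.1:8188";

// Resolves the image-worker settings from an environment-like map.
// Assumption: empty or whitespace-only values count as unset.
function resolveComfyConfig(env: Record<string, string | undefined>): ComfyConfig {
  const url = env.COMFY_URL?.trim() || DEFAULT_COMFY_URL;
  const dir = env.COMFY_DIR?.trim() || undefined;
  return { url, dir, canLaunch: dir !== undefined };
}

// Example: only COMFY_DIR is set (hypothetical path), so the default URL is used.
console.log(resolveComfyConfig({ COMFY_DIR: "/opt/ComfyUI" }));
```

In a Node process the same function could be called with `process.env`. Taking the map as a parameter keeps the sketch easy to test.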
Options
--token <token> Authentication token from c0mpute.ai (required)
--mode <mode> "max" (text) or "image". Prompts on first run if omitted.
--model <model> Max model: "qwen" or "supergemma". Prompts on first run if omitted.
--url <url> Orchestrator URL (default: https://c0mpute.ai)
--benchmark Run benchmark only, then exit (Max mode)
--version Show version
--help Show help
Earnings
Workers earn credits for completing jobs — text jobs by tier and tokens generated, image jobs per render. Check your earnings at c0mpute.ai.
