@auxot/worker-cli

v0.3.5

Auxot GPU worker CLI - connects local GPU resources to Auxot platform

Auxot GPU Worker CLI

Connect your local GPU resources to the Auxot platform. The worker CLI automatically downloads and manages llama.cpp binaries and models based on your GPU key policy.

Requirements

  • Node.js 20+
  • Valid Auxot GPU key with configured policy
  • GPU hardware (NVIDIA CUDA, AMD Vulkan, or Apple Metal); CPU fallback is available with limitations

Installation

# Run directly with npx (recommended)
npx @auxot/worker-cli --gpu-key YOUR_GPU_KEY

# Or install globally
npm install -g @auxot/worker-cli
worker-cli --gpu-key YOUR_GPU_KEY
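
To pin the CLI to a known release instead of whatever is latest, npx also accepts an explicit version (0.3.5 is the version this README ships with):

# Pin a specific version when reproducibility matters
npx @auxot/worker-cli@0.3.5 --gpu-key YOUR_GPU_KEY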

Quick Start

1. Get Your GPU Key

  1. Log in to Auxot
  2. Navigate to Organization Settings → GPU Keys
  3. Create a new GPU key
  4. Configure the policy (model, quantization, context size, capabilities)
  5. Copy the key (format: gpu.xxxxx.yyyyy)

2. Run the Worker

npx @auxot/worker-cli --gpu-key gpu.xxxxx.yyyyy

The worker CLI will:

  • ✅ Automatically download the llama.cpp binary (first run only)
  • ✅ Automatically download the model specified in your GPU key policy
  • ✅ Spawn and manage the llama.cpp process
  • ✅ Connect to the Auxot platform
  • ✅ Start processing jobs

CLI Options

--gpu-key <key>        GPU authentication key (required)
--auxot-url <url>      Auxot platform URL (default: https://auxot.com)
--debug [level]        Enable debug logging (level 1 or 2, default: 1)
--help, -h             Show help message

Note: The --llama-url option is no longer needed. The worker CLI manages its own llama.cpp instance on http://127.0.0.1:9002.
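
Since the worker owns that local instance, you can sanity-check it while the worker is running. llama.cpp's bundled server normally exposes a /health endpoint, so assuming the worker runs it with defaults, a quick liveness probe looks like:

# Returns a small JSON status once the server is up and the model is loaded
curl http://127.0.0.1:9002/health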

How It Works

  1. Policy Reception: Worker CLI connects to Auxot and receives the GPU key policy (model, quantization, context size, etc.)
  2. Model Download: Automatically downloads the required GGUF model from Hugging Face if not already cached
  3. Binary Download: Downloads the appropriate llama.cpp binary for your platform (first run only)
  4. Process Management: Spawns llama.cpp with policy parameters and manages its lifecycle
  5. Capability Discovery: Queries llama.cpp to discover actual model capabilities (see the example after this list)
  6. Validation: Validates discovered capabilities against the policy (both client and server-side)
  7. Job Processing: Listens for agent execution jobs and forwards them to llama.cpp
  8. Streaming: Streams response tokens back to the platform in real time
  9. Crash Recovery: Automatically restarts llama.cpp if it crashes
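
Step 5 relies on llama.cpp's own introspection: the server reports details about the loaded model through endpoints such as /props. Exactly how the worker maps that output onto policy capabilities is internal to the CLI, but you can inspect the same data yourself; this is a sketch, assuming the default local address from the note above:

# Ask the managed llama.cpp server what it actually loaded (model, context size, slots)
curl http://127.0.0.1:9002/props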

GPU Key Policy

The GPU key policy defines what model and configuration your worker must use:

  • Model Name: Which model to load (e.g., "Qwen3-VL-30B-A3B")
  • Quantization: Model quantization level (e.g., "Q4_K_S", "F16")
  • Context Size: Maximum context window (e.g., 128000)
  • Max Parallelism: Maximum parallel jobs (e.g., 2)
  • Capabilities: Required capabilities (e.g., ["chat", "vision"])

The worker CLI validates that its discovered capabilities match the policy before accepting jobs.

Model Storage

Models are cached in ~/.auxot/models/ (the exact location varies by platform). You can override this with the AUXOT_MODELS_DIR environment variable.
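
For example, to keep multi-gigabyte models on a dedicated volume (the path below is just an illustration):

# Cache models somewhere other than the home directory
AUXOT_MODELS_DIR=/mnt/models npx @auxot/worker-cli --gpu-key gpu.xxxxx.yyyyy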

Binary Storage

The llama.cpp binary is cached in ~/.auxot/llama-server/{platform}-{arch}/. The binary is downloaded once and reused on subsequent runs.
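
To see which builds have been fetched so far:

# List cached llama.cpp binaries per platform/architecture
ls ~/.auxot/llama-server/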

GPU Detection

The worker CLI automatically detects your GPU hardware:

  • macOS: Metal GPU acceleration (built into binaries)
  • Linux: Vulkan GPU acceleration (AMD/NVIDIA) or CPU fallback
  • Windows: CUDA 12.4 GPU acceleration (NVIDIA) or CPU fallback

If no GPU is detected, the worker will:

  • Download a CPU-only binary with a warning
  • Limit the model size to 7B or less (if the policy specifies a larger model, validation will fail)
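
To preview what the worker is likely to find, you can query the hardware yourself with standard vendor tools (none of these are part of the worker CLI):

# NVIDIA: driver version, VRAM, and utilization
nvidia-smi

# Vulkan on Linux: enumerate Vulkan-capable devices
vulkaninfo --summary

# macOS: list GPUs visible to the system
system_profiler SPDisplaysDataType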

GPU ID

The worker CLI generates a stable UUID on first run and stores it in ~/.auxot/gpu-id. This allows Auxot to track individual GPUs across restarts.
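
You can read the ID directly. Deleting the file should result in a fresh UUID on the next run, though that is an inference from the first-run behavior described above rather than documented behavior:

# Show this machine's stable worker identity
cat ~/.auxot/gpu-id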

Troubleshooting

Connection Failed

  • Verify --auxot-url is correct
  • Check network connectivity
  • Ensure GPU key is valid and has a configured policy
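
Raising the log level is often the quickest way to see which of these is the problem; both flags are documented under CLI Options above:

# Verbose connection logging against an explicit platform URL
npx @auxot/worker-cli --gpu-key gpu.xxxxx.yyyyy --auxot-url https://auxot.com --debug 2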

Policy Validation Failed

  • Ensure your GPU key policy is configured in the web UI
  • Check that the model specified in the policy exists in the model registry
  • Verify your GPU hardware meets the policy requirements (context size, capabilities)

Model Download Failed

  • Check internet connectivity (models are downloaded from Hugging Face)
  • Verify sufficient disk space (models can be 10GB+)
  • Check Hugging Face API rate limits (downloads are resumable)
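
A quick disk check before retrying rules out the most common cause; this uses standard coreutils against the default cache location:

# Confirm there is room for a 10GB+ model in the model cache
df -h ~/.auxot/models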

llama.cpp Crashes

  • Check GPU memory (VRAM); the model may not fit in available memory
  • Review llama.cpp logs in the worker CLI output
  • The worker CLI will attempt to auto-restart crashed processes

No Jobs Received

  • Verify GPU key belongs to the correct organization
  • Check that agents exist in your organization
  • Ensure GPU meets minimum context size requirements
  • Verify worker is online (check dashboard in web UI)

Support

For issues, questions, or feature requests, please visit:

License

Copyright © 2026 Auxot. All rights reserved.