
gpuautoscaler

v1.1.4

Kubernetes-native GPU Autoscaler: maximize utilization, minimize cloud costs

GPU Autoscaler


Maximize GPU cluster utilization, minimize cloud costs

GPU Autoscaler is an open-source Kubernetes-native system that reduces GPU infrastructure costs by 30-50% through intelligent workload packing, multi-tenancy support (MIG/MPS/time-slicing), and cost-optimized autoscaling.

🎯 Problem

Organizations running GPU workloads on Kubernetes waste 40-60% of GPU capacity and overspend by millions annually due to:

  • Allocation vs. Utilization Gap: Pods request full GPUs but use <50% of resources
  • Inefficient Multi-Tenancy: Small inference jobs monopolize entire GPUs when they could share
  • Suboptimal Autoscaling: Kubernetes doesn't understand GPU-specific metrics (VRAM, SM utilization)
  • Lack of Cost Attribution: No per-team/experiment cost tracking

✨ Features

Phase 1: Observability Foundation (Available Now)

  • 📊 Real-time GPU Metrics: DCGM integration with Prometheus for GPU utilization, VRAM, temperature, power
  • 🔍 Waste Detection: Identify underutilized GPUs and workloads
  • 📈 Grafana Dashboards: Pre-built dashboards with Kubernetes pod attribution
  • 💰 Cost Visibility: Track GPU spend per namespace/team/experiment

Phase 2: Intelligent Packing (Available Now)

  • 🎯 Bin-Packing Algorithm: Automatically consolidate GPU workloads with BestFit/FirstFit/WorstFit strategies
  • 🔀 NVIDIA MIG Support: Hardware partitioning for A100/H100 GPUs with 10 profiles (1g.5gb to 7g.80gb)
  • 🔄 NVIDIA MPS Support: Process-level GPU sharing for inference workloads with configurable clients
  • ⏱️ Time-Slicing: Software-based sharing for compatible workloads with configurable replicas
  • 🎫 Admission Webhook: Zero-touch optimization with automatic strategy selection; no workload changes required
  • 📋 GPU Sharing Policies: CRDs for cluster-wide and namespace-specific optimization policies
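
To make the packing strategies above concrete, here is a minimal sketch of a BestFit placement decision in Go. The `gpuNode` type and `bestFit` function are illustrative, not the project's actual API: BestFit places a workload on the node that would be left with the least free GPU memory, so work consolidates onto fewer GPUs.

```go
package main

import "fmt"

// gpuNode is a hypothetical view of a node's free GPU memory in MiB.
type gpuNode struct {
	name string
	free int
}

// bestFit returns the index of the node that can host a request of req MiB
// while leaving the least free memory behind, or -1 if no node fits.
func bestFit(nodes []gpuNode, req int) int {
	best, bestLeft := -1, 1<<31-1
	for i, n := range nodes {
		left := n.free - req
		if left >= 0 && left < bestLeft {
			best, bestLeft = i, left
		}
	}
	return best
}

func main() {
	nodes := []gpuNode{
		{"node-a", 40960},
		{"node-b", 10240},
		{"node-c", 20480},
	}
	// An 8 GiB request fits tightest on node-b (leaves 2 GiB free).
	i := bestFit(nodes, 8192)
	fmt.Println(nodes[i].name) // node-b
}
```

FirstFit and WorstFit differ only in the selection rule: FirstFit takes the first node that fits, WorstFit the one that leaves the most free memory (spreading load instead of consolidating it).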

Phase 3: Autoscaling Engine (Available Now)

  • 🚀 GPU-Aware Autoscaling: Scale based on actual GPU utilization, not just CPU/memory
  • 💸 Spot Instance Orchestration: Prioritize cheaper spot instances with graceful eviction handling
  • 🎚️ Multi-Tier Scaling: Optimize cost with spot → on-demand → reserved instance strategy
  • 🔮 Predictive Scaling: Pre-warm nodes for known busy periods based on historical patterns
  • ☁️ Multi-Cloud Support: AWS, GCP, and Azure integrations with Auto Scaling Groups / Managed Instance Groups / VM Scale Sets
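
The multi-tier fallback described above can be sketched as a simple preference walk. This is an illustrative Go fragment, not the autoscaler's real implementation: tiers are tried in the configured order (spot → on-demand → reserved), and the first tier with enough launchable capacity wins.

```go
package main

import "fmt"

// tier is a hypothetical capacity tier; prices are illustrative.
type tier struct {
	name      string
	hourlyUSD float64
	capacity  int // instances launchable in this tier right now
}

// pickTier walks tiers in preference order and returns the first one
// that can satisfy the requested number of instances.
func pickTier(tiers []tier, needed int) (string, bool) {
	for _, t := range tiers {
		if t.capacity >= needed {
			return t.name, true
		}
	}
	return "", false
}

func main() {
	tiers := []tier{
		{"spot", 0.92, 0},      // preferred, but the spot pool is exhausted
		{"on-demand", 3.06, 8}, // fallback with available capacity
		{"reserved", 2.10, 4},
	}
	name, ok := pickTier(tiers, 2)
	fmt.Println(name, ok) // on-demand true
}
```

In the real system the same decision would also weigh eviction risk and graceful-drain behavior for spot nodes, not just raw capacity.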

Phase 4: Cost Management (Available Now)

  • 💵 Real-time Cost Tracking: Per-second GPU cost calculation with AWS/GCP/Azure pricing APIs
  • 📊 Cost Attribution: Track spend by namespace, team, label, and experiment ID with historical data
  • 🎯 Budget Management: Set spending limits with configurable alerts (Slack, Email, PagerDuty, Webhook)
  • 🚨 Budget Enforcement: Throttle or block workloads when budgets are exceeded with grace periods
  • 💾 TimescaleDB Integration: Store historical cost data with hypertables and continuous aggregates
  • 📈 ROI Reporting: Demonstrate savings from MIG/MPS/time-slicing optimizations with detailed breakdowns
  • 🔍 Cost Drill-Down: Analyze costs by node, GPU model, workload type, and time period
  • 📊 Grafana Dashboards: Pre-built cost dashboards with hourly rate, budget burn rate, and attribution charts

🚀 Quick Start

Prerequisites

  • Kubernetes 1.25+ cluster with GPU nodes
  • NVIDIA GPU drivers (470.x+ recommended)
  • NVIDIA device plugin installed
  • Helm 3.x

Installation

# Add the GPU Autoscaler Helm repository
helm repo add gpu-autoscaler https://gpuautoscaler.github.io/charts
helm repo update

# Install GPU Autoscaler with default settings
helm install gpu-autoscaler gpu-autoscaler/gpu-autoscaler \
  --namespace gpu-autoscaler-system \
  --create-namespace

# Verify installation
kubectl get pods -n gpu-autoscaler-system

Access Dashboards

# Port-forward Grafana
kubectl port-forward -n gpu-autoscaler-system svc/gpu-autoscaler-grafana 3000:80

# Open browser to http://localhost:3000
# Default credentials: admin / (see secret)

View GPU Metrics

# Install the CLI tool
curl -sSL https://gpuautoscaler.io/install.sh | bash

# Check cluster GPU utilization
gpu-autoscaler status

# Analyze waste and get recommendations
gpu-autoscaler optimize

# View cost breakdown
gpu-autoscaler cost --namespace ml-team --last 7d

Cost Management Setup

# Enable cost tracking with cloud provider credentials
helm upgrade gpu-autoscaler gpu-autoscaler/gpu-autoscaler \
  --namespace gpu-autoscaler-system \
  --set cost.enabled=true \
  --set cost.provider=aws \
  --set cost.region=us-west-2

# Configure TimescaleDB for historical data (optional)
helm upgrade gpu-autoscaler gpu-autoscaler/gpu-autoscaler \
  --set cost.timescaledb.enabled=true \
  --set cost.timescaledb.host=timescaledb.default.svc.cluster.local \
  --set cost.timescaledb.database=gpucost

# Create a cost budget with alerts
cat <<EOF | kubectl apply -f -
apiVersion: gpuautoscaler.io/v1alpha1
kind: CostBudget
metadata:
  name: ml-team-monthly-budget
spec:
  scope:
    type: namespace
    namespaceSelector:
      matchLabels:
        team: ml-team
  budget:
    amount: 50000.0  # $50K/month
    period: 30d
  alerts:
    - threshold: 80
      channels:
        - type: slack
          config:
            webhook: https://hooks.slack.com/services/YOUR/WEBHOOK/URL
    - threshold: 100
      channels:
        - type: email
          config:
            to: [email protected]
  enforcement:
    mode: throttle  # Options: alert, throttle, block
    gracePeriod: 2h
    throttleConfig:
      maxSpotInstances: 5
      blockOnDemand: true
EOF

# Create cost attribution tracking
cat <<EOF | kubectl apply -f -
apiVersion: gpuautoscaler.io/v1alpha1
kind: CostAttribution
metadata:
  name: ml-team-attribution
spec:
  scope:
    type: namespace
    namespaceSelector:
      matchLabels:
        team: ml-team
  trackBy:
    - namespace
    - label:experiment-id
    - label:user
  reportingPeriod: 24h
EOF

# View cost reports
gpu-autoscaler cost report --format table
gpu-autoscaler cost report --format json > cost-report.json
gpu-autoscaler cost roi --last 30d

📊 Results

Organizations using GPU Autoscaler report:

  • 40-60% cost reduction through better utilization and spot instances
  • GPU utilization: 40% → 75%+ through intelligent packing
  • 2.5+ jobs per GPU (up from 1.2) through MIG/MPS sharing
  • $500K-$2M annual savings on GPU infrastructure

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                   GPU Kubernetes Cluster                     │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐            │
│  │ GPU Node   │  │ GPU Node   │  │ GPU Node   │            │
│  │ + DCGM     │  │ + DCGM     │  │ + DCGM     │            │
│  │ + ML Pods  │  │ + ML Pods  │  │ + ML Pods  │            │
│  └────────────┘  └────────────┘  └────────────┘            │
└──────────────────────┬──────────────────────────────────────┘
                       │ (metrics)
                       ▼
┌─────────────────────────────────────────────────────────────┐
│              GPU Autoscaler Control Plane                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │ Prometheus   │→ │  Controller  │→ │  Admission   │      │
│  │  (Metrics)   │  │  (Packing)   │  │   Webhook    │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
│         │                  │                  │             │
│         ▼                  ▼                  ▼             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │  Grafana     │  │  Autoscaler  │  │    Cost      │      │
│  │ (Dashboards) │  │   Engine     │  │  Calculator  │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
└─────────────────────────────────────────────────────────────┘

📖 Documentation

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Development Setup

# Clone the repository
git clone https://github.com/gpuautoscaler/gpuautoscaler.git
cd gpuautoscaler

# Install dependencies
make install-deps

# Run tests
make test

# Build controller
make build

# Run locally (requires Kind with GPU support)
make dev-cluster

📦 Release Guide

GPU Autoscaler uses automated semantic versioning with svu and GoReleaser. Releases are automatically created when commits are pushed to the main branch.

Commit Message Convention

Follow Conventional Commits to trigger the correct version bump:

Patch Release (v1.0.x) - Bug fixes and minor changes:

fix: resolve memory leak in cost tracker
fix(api): handle nil pointer in budget controller

Minor Release (v1.x.0) - New features (backward compatible):

feat: add support for Azure GPU pricing
feat(cost): implement custom alert templates

Major Release (vx.0.0) - Breaking changes:

feat!: redesign CostBudget API with new fields
fix!: change default enforcement mode to throttle

BREAKING CHANGE: CostBudget.spec.limit renamed to CostBudget.spec.budget.amount

Release Process

  1. Merge PR to main: All commits to main trigger the release workflow
  2. Automatic tagging: GitHub Actions calculates the next version using svu based on commit messages
  3. Build & publish: GoReleaser creates:
    • GitHub Release with binaries (Linux, macOS, Windows)
    • Changelog from commit messages
    • Checksums for all artifacts

Manual Release (if needed)

# Create and push a tag manually
git tag v1.0.5
git push origin v1.0.5

# The release workflow will trigger automatically

Release Assets

Each release includes:

  • gpu-autoscaler-controller - Kubernetes controller binary (Linux/macOS, amd64/arm64)
  • gpu-autoscaler - CLI tool (Linux/macOS/Windows, amd64/arm64)
  • Source code archives
  • SHA256 checksums

📝 License

Apache License 2.0 - see LICENSE file for details.

🙏 Acknowledgments

  • NVIDIA for DCGM and GPU technologies (MIG, MPS)
  • Kubernetes community for device plugin and scheduler frameworks
  • CNCF projects: Prometheus, Grafana, Karpenter

🔗 Links

📊 Project Status

Current Phase: Phase 4 - Cost Management ✅

All four phases are now complete:

  • ✅ Phase 1: Observability Foundation
  • ✅ Phase 2: Intelligent Packing
  • ✅ Phase 3: Autoscaling Engine
  • ✅ Phase 4: Cost Management

See our roadmap for future enhancements and planned features.


Made with ❤️ by the GPU Autoscaler community