npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

free-llm-api-provider

v1.0.0

Published

Local LLM proxy with automatic failover between 25+ free AI providers. Self-contained — no external dependencies.

Readme

free-llm-api-provider

A self-contained local LLM proxy with automatic failover across 25+ free AI providers and 238 models. No external dependencies — install once, configure keys, forget about it.

Stop managing API keys. Stop worrying about rate limits. Stop paying for AI.

Why?

Free AI APIs break — rate limits, downtime, capacity issues. Your coding assistant dies mid-session.

free-llm-api-provider runs a local OpenAI-compatible proxy that automatically routes to the next available provider when one fails. Zero config after setup.

What It Does

  • Health-aware routing: Real-time ping monitoring routes to the healthiest provider instantly — no blind iteration
  • Auto-failover: 429/500/timeout → switch provider automatically
  • Sticky provider: Once a provider works, it keeps using it until it fails (faster response)
  • Multi-key support: Add multiple keys for the same provider, tries them all before failing over
  • 238 models across 25 providers (NVIDIA, Groq, OpenRouter, Cerebras, etc.)
  • Tier-based routing: tier-splus (elite) → tier-b (default), with health scores overriding tiers
  • Real-time status dashboard: --status shows live provider health, latency, and quota
  • 10s reliability analysis: --fiable finds the most stable provider right now
  • OpenAI-compatible: Works with Cursor, VS Code, Claude Desktop, OpenCode, any client
  • Self-contained: Built-in config manager + model catalog + health checker. No separate tools needed.
  • Zero dependencies: Pure Node.js. No Python, no Docker required.
  • Core replicated from free-coding-models: Uses the same ping/quota/extraction logic

Supported Providers

| # | Provider | Models | Free Tier | Env Var | |---|----------|--------|-----------|---------| | 1 | NVIDIA NIM | 46 | ~40 RPM (no CC) | NVIDIA_API_KEY | | 2 | Groq | 8 | 30 RPM, 1K-14.4K/day | GROQ_API_KEY | | 3 | Cerebras | 4 | 30 RPM, 1M tokens/day | CEREBRAS_API_KEY | | 4 | OpenRouter | 25 | 50/day free, 1K/day with $10 | OPENROUTER_API_KEY | | 5 | SambaNova | 13 | Dev tier generous | SAMBANOVA_API_KEY | | 6 | Hyperbolic | 13 | $1 free credits | HYPERBOLIC_API_KEY | | 7 | Cloudflare | 15 | 10K neurons/day | CLOUDFLARE_API_TOKEN | | 8 | Google AI Studio | 6 | 14.4K/day | GOOGLE_API_KEY | | 9 | ZAI | 7 | Generous quota | ZAI_API_KEY | | 10 | Scaleway | 10 | 1M free tokens | SCALEWAY_API_KEY | | 11 | SiliconFlow | 6 | 100/day + $1 credits | SILICONFLOW_API_KEY | | 12 | + 14 more | | | |

Quick Start

1. Install

npm install -g free-llm-api-provider

2. Configure API Keys (One-time)

free-llm-api-provider --config

Interactive wizard prompts for keys. Need only one to start. More = better failover.

Recommended first key: Groq — https://console.groq.com/keys (30 RPM, no credit card)

You can also set keys via environment variables:

export GROQ_API_KEY="your_key_here"
export NVIDIA_API_KEY="your_key_here"
export OPENROUTER_API_KEY="your_key_here"

Multi-key support: You can add multiple keys for the same provider:

# In the config wizard, add a key, then add another for the same provider
# The proxy will try all keys before failing over to the next provider

Where are keys stored?

  • Config file: ~/.free-llm-api-provider.json (outside project directory)
  • Or via environment variables (env vars override config file)

3. Start Proxy

free-llm-api-provider

Proxy runs at http://localhost:4000.

4. Configure Your AI Client

| Client | Base URL | API Key | |--------|----------|---------| | Cursor | http://localhost:4000/v1 | sk-free-llm-api-provider | | VS Code | http://localhost:4000/v1 | sk-free-llm-api-provider | | Claude Desktop | http://localhost:4000/v1 | sk-free-llm-api-provider | | OpenCode | http://localhost:4000/v1 | sk-free-llm-api-provider |

CLI Commands

All commands work with or without --:

# Start proxy (default)
free-llm-api-provider
free-llm-api-provider start

# Interactive config wizard
free-llm-api-provider --config
free-llm-api-provider config

# Show current config
free-llm-api-provider --show
free-llm-api-provider show

# Real-time provider health dashboard (live updating)
free-llm-api-provider --status
free-llm-api-provider status

# 10-second reliability analysis (finds best provider right now)
free-llm-api-provider --fiable
free-llm-api-provider fiable

# List all 238 models
free-llm-api-provider --models
free-llm-api-provider models

# List S+ tier only
free-llm-api-provider --models --tier S+
free-llm-api-provider models --tier S+

# List models for specific provider
free-llm-api-provider --models --provider groq
free-llm-api-provider models --provider groq

# Stop / restart proxy
free-llm-api-provider --stop
free-llm-api-provider stop
free-llm-api-provider --restart
free-llm-api-provider restart

# View proxy logs
free-llm-api-provider --logs
free-llm-api-provider logs

# Test proxy health
free-llm-api-provider --test
free-llm-api-provider test

Shortcuts with flap alias

After installation, you can also use the shorter flap command:

flap status          # Same as free-llm-api-provider --status
flap test            # Same as free-llm-api-provider --test
flap stop            # Same as free-llm-api-provider stop
flap restart         # Same as free-llm-api-provider restart
flap show            # Same as free-llm-api-provider --show
flap models          # Same as free-llm-api-provider --models
flap fiable          # Same as free-llm-api-provider --fiable
flap logs            # Same as free-llm-api-provider --logs
flap config          # Same as free-llm-api-provider --config

API Usage

Python

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000/v1",
    api_key="sk-free-llm-api-provider"
)

response = client.chat.completions.create(
    model="tier-splus",  # or tier-s, tier-aplus, tier-a, tier-b
    messages=[{"role": "user", "content": "Write hello world in Rust"}]
)
print(response.choices[0].message.content)

cURL

curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-free-llm-api-provider" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tier-splus",
    "messages": [{"role": "user", "content": "Hi"}]
  }'

How Failover Works

Health-Aware Routing (New)

The proxy continuously pings all providers in the background. Each request is routed to the healthiest provider first — no blind iteration.

Provider Health Scores (updated every 30s):
  Groq      → Score: 95, Latency: 198ms ✅
  Cerebras  → Score: 80, Latency: 299ms
  OpenRouter→ Score: 45, Latency: 1200ms (slow)

Request → Groq (healthiest) → SUCCESS ✓

If the healthiest fails, it tries the next in score order. If no health data exists yet, falls back to tier-based ordering.

Traditional Failover (Fallback)

Request → NVIDIA → 429 rate limit
        → Groq   → 500 error
        → Cerebras → SUCCESS ✓

Sticky provider optimization:

Request 1: Try providers until one works → OpenRouter ✅
Request 2+: Use OpenRouter directly (fast!)
Request N: OpenRouter fails → Try next provider → NVIDIA ✅

Multi-key failover:

Groq key 1 → 401 ❌
Groq key 2 → 429 ❌
Groq key 3 → 200 ✅  ← Stays here until it fails

Proxy handles retries + provider switching. Your client sees only success or final exhaust error.

Model Tiers

Tiers based on SWE-bench scores (coding benchmark), replicated from free-coding-models:

| Tier | Score | Description | |------|-------|-------------| | S+ | 70%+ | Frontier models. Best for complex refactors, architecture decisions | | S | 60-70% | Excellent coding models. Reliable for most tasks | | A+ | 50-60% | Very capable. Great alternatives to frontier models | | A | 40-50% | Solid performers. Good for general coding | | A- | 35-40% | Decent. Usable for simpler tasks | | B+ | 30-35% | Capable for smaller tasks and quick scripts | | B | 20-30% | Entry-level. Default fallback tier | | C | <20% | Basic models. Last resort only |

Use tier aliases in requests: tier-splus, tier-s, tier-aplus, tier-a, tier-aminus, tier-bplus, tier-b

Config File

Stored at ~/.free-llm-api-provider.json:

{
  "apiKeys": {
    "groq": ["gsk_key1", "gsk_key2"],
    "nvidia": "nvapi-key1",
    "openrouter": "sk-or-key1"
  },
  "providers": {
    "groq": { "enabled": true },
    "nvidia": { "enabled": true }
  }
}

Priority order:

  1. Environment variables (highest priority)
  2. Config file keys
  3. Multiple keys per provider (tries all before failing over)

Troubleshooting

"No providers configured" → Run free-llm-api-provider --config

"Port 4000 in use"free-llm-api-provider --stop then start again

"Rate limit errors" → Normal with free APIs. Proxy auto-switches. Add more providers for better coverage.

"All providers failed" → Check your API keys are valid with free-llm-api-provider --show

Architecture

  • CLI + Config: Pure Node.js, zero runtime dependencies
  • Model Catalog: 238 models with tiers, replicated from free-coding-models
  • Health Checker: Real-time ping monitoring with quota extraction from rate limit headers (core from free-coding-models)
  • Proxy: Node.js HTTP proxy with health-aware routing + sticky provider + failover
  • Multi-key: Automatically tries all keys per provider before failover
  • Circuit breaker: Temporarily skips failing providers
  • Status Dashboard: Live terminal UI showing provider health (--status)
  • Reliability Analysis: 10s analysis mode to find the most stable provider (--fiable)

License

MIT — Use freely, modify freely, no warranty.

Star History