xswarm-freeloader
v2.0.5
Published
Intelligent AI router that maximizes free tier usage across 100+ LLM providers
Maintainers
Readme
$ npx xswarm-freeloader
███████╗██████╗ ███████╗███████╗
██╔════╝██╔══██╗██╔════╝██╔════╝
█████╗ ██████╔╝█████╗ █████╗
██╔══╝ ██╔══██╗██╔══╝ ██╔══╝
██║ ██║ ██║███████╗███████╗
╚═╝ ╚═╝ ╚═╝╚══════╝╚══════╝
██╗ ██████╗ █████╗ ██████╗ ███████╗██████╗
██║ ██╔═══██╗██╔══██╗██╔══██╗██╔════╝██╔══██╗
██║ ██║ ██║███████║██║ ██║█████╗ ██████╔╝
██║ ██║ ██║██╔══██║██║ ██║██╔══╝ ██╔══██╗
███████╗╚██████╔╝██║ ██║██████╔╝███████╗██║ ██║
╚══════╝ ╚═════╝ ╚═╝ ╚═╝╚═════╝ ╚══════╝╚═╝ ╚═╝
"Your AI provider's worst nightmare."
Freeloader is a local LLM proxy router that rotates requests around current
free-tier offers in an attempt to reduce or wipe out your AI bill.
✓ Setting up...
Created ~/.xswarm/
SQLite database initialized (WAL mode)
pm2 process manager ready
✓ Scanning for local models...
Ollama detected — 3 models (llama3.2, mistral, phi3)
Local models added to private trust tier
✓ Provider catalog synced
Gemini ............ 3 free models
Groq .............. 2 free models
Mistral ........... 1 free model
OpenRouter ........ 2 free models
Ollama ............ 3 local models
Together, OpenAI, Anthropic (paid fallback)
Free first → lowest cost next → premium last
✓ Router + Dashboard → http://localhost:4011
Somewhere, a pricing page is crying.The problem
You're paying OpenAI $200/month. Meanwhile:
- Gemini 2.0 Flash gives you 1,500 free requests/day with 1M context
- Groq gives you 14,400 free requests/day on Llama 3.3 70B
- Mistral gives you 500 free requests/day
- OpenRouter has a dozen models at literally $0.00
That's millions of free tokens per month across providers. But juggling API keys, rate limits, failovers, and response formats across all of them? Nobody has time for that.
Freeloader does it for you. One command. One endpoint. Zero config.
Install
npx xswarm-freeloaderThat's the whole install. It creates ~/.xswarm/, starts the router and dashboard via pm2, and opens the dashboard in your browser. Survives reboots.
Use it
Change one line. Your code stays the same:
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: 'http://localhost:4011/v1', // ← this line
apiKey: 'free',
});
// Freeloader intercepts this and routes to the best free model
const res = await client.chat.completions.create({
model: 'gpt-4o',
messages: [{ role: 'user', content: 'Hello!' }],
});Works with any OpenAI-compatible SDK, framework, or tool — LangChain, LlamaIndex, Vercel AI SDK, Continue, Cursor, you name it. If it takes an OPENAI_BASE_URL, it works.
# or just curl it
curl http://localhost:4011/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}]}'How it works
Your app (unchanged)
│
▼
┌──────────────────────────────────────────────┐
│ Freeloader Router (localhost:4011) │
│ │
│ 1. Score all available models │
│ 2. Filter by trust tier & capabilities │
│ 3. Sanitize sensitive content (optional) │
│ 4. Pick the best free model that fits │
│ 5. Fall back to paid if free is exhausted │
│ 6. Enforce your budget (hard stop at $0) │
│ 7. Return standard OpenAI response shape │
└──────────────────────────────────────────────┘
│
▼
Gemini · Groq · Mistral · OpenRouter · Together · Ollama · ...Freeloader scores every available model on three axes — cost, speed, and quality — filtered by your trust requirements. Free tiers are always preferred. When they're exhausted, it falls back to your cheapest paid option. When your budget says stop, it stops.
Features
Intelligent routing
- Cost/speed/quality scoring with configurable weights
- Automatic free tier rotation — exhausts every free tier before spending a cent
- Circuit breaker health monitoring with automatic provider failover
- Degradation scoring — tracks provider reliability over time and routes away from flaky backends
- Capability-aware — automatically detects tool use, vision, and long context requirements
Security & isolation
- Per-app API keys with hash-based authentication (raw keys never stored)
- Per-app policies — allowed/blocked providers, budget overrides, sanitization profiles per consumer
- Trust tiers — control exactly which providers see which data
- Request sanitization — optional PII redaction and secret detection before requests leave your machine
- Content policy enforcement — block requests containing secrets or sensitive patterns
Observability
- Multi-range reporting — 24h, 7d, 30d, 90d views with trend comparisons
- Professional PDF reports — 4-page executive summaries generated locally, emailed on schedule
- HTML email digests — savings highlights, provider breakdowns, error rates, opportunities
- Per-app analytics — drill down into any app's usage, cost, and provider mix
- Hourly breakdown — see traffic patterns across the day
- Config versioning — every settings change is versioned with rollback support
Email reports (optional)
- Resend integration — free tier (100 emails/day), just add your API key in the dashboard
- Custom SMTP — use any email provider you already have
- Test reports — send a test report from the dashboard to verify your setup
- Reports include PDF attachment with executive summary, hourly breakdowns, provider distribution, and growth metrics
Dashboard
A local web UI at http://localhost:4011 — dark paper theme, 8 views:
| View | What it does | |------|-------------| | Overview | Live request feed, spend tracking, provider health, cost/request charts | | Providers | Full catalog — health, models, capabilities, free tier status, degradation scores | | Apps | Create API consumers with per-app budgets, trust tiers, and sanitization profiles | | App Detail | Deep dive into a single app — keys, policies, usage timeseries, provider mix | | Routing | Tune cost/speed/quality weights with live sliders, set quality gates | | Usage | Filterable request log, cost breakdowns by provider, app, and day | | Opportunities | Suggestions for saving more — unused free tiers, missing API keys | | Settings | Password, email reports (Resend/SMTP), port config, config versioning |
Security: your data never leaves your machine
Freeloader is architecturally local-only. There is no Freeloader cloud service, no hosted backend, no telemetry server. Everything runs on your machine:
- Your API keys are encrypted with AES-256-CBC and stored in a local SQLite database at
~/.xswarm/freeloader.db. They never leave your machine except to authenticate with the provider you chose. - Your request content passes from your app → Freeloader → the AI provider you configured. Freeloader never copies, logs, or transmits your prompt content to any Freeloader infrastructure (there is none).
- Your usage data (token counts, costs, latency) is stored locally for your dashboard and reports. It is never sent anywhere.
- Your reports are generated locally as PDF files saved to
~/.xswarm/reports/. Email delivery is strictly opt-in — you must configure it yourself. - Dashboard auth uses bcrypt-hashed passwords and JWT tokens, all local.
The only network calls Freeloader makes:
- Catalog sync — fetches the public model catalog from
catalog.freeloader.xswarm.ai(a static JSON file listing available providers/models, no user data sent) - AI provider requests — your prompts go to the providers you configured, exactly as they would if you called them directly
- Email delivery — only if you explicitly enable it and configure an email provider
If your hard budget is $0 and you only use free tiers, the cost to you is literally nothing. No subscription, no usage fees, no data monetization. MIT licensed.
Free tier coverage
These are real, production-quality models with generous free tiers:
| Provider | Free Models | Daily Limit | Highlights | |----------|-------------|-------------|------------| | Google Gemini | Gemini 2.5 Pro, 2.0 Flash, Flash Lite | 25-1,500 req/day | 1M context, vision, tools | | Groq | Llama 3.3 70B, Gemma 2 9B | 14,400 req/day | Fastest inference anywhere | | Mistral | Mistral Small | 500 req/day | 128K context, EU-hosted | | OpenRouter | Llama 3.3 70B, Gemini Flash | Varies | Aggregated free models | | Local | Anything via Ollama/LM Studio | Unlimited | Private, zero-cost, zero-latency |
Combined: millions of free tokens per month. Freeloader automatically rotates through providers as rate limits are hit, so you get the maximum possible free usage before any paid fallback.
Trust tiers
Not all data should go everywhere. Tag your requests:
| Tier | Where data goes | Use case |
|------|----------------|----------|
| open | Any provider globally | Public content, non-sensitive tasks |
| standard | US/EU providers only | Business data, PII with DPA coverage |
| private | Local models only | Medical, legal, secrets — never leaves your machine |
// private data stays on your machine
await client.chat.completions.create({
model: 'gpt-4o',
messages: [{ role: 'user', content: patientRecord }],
metadata: { trust_tier: 'private' }, // routes to Ollama only
});Per-app policies
Create isolated API consumers with granular controls:
// Each app gets its own API key, budget, trust tier, and routing rules
const app = {
name: 'customer-chatbot',
trust_tier: 'standard',
budget_daily_hard: 5.00,
sanitization_profile: 'standard', // auto-redact PII
allowed_providers: ['openai', 'anthropic', 'gemini'],
blocked_providers: ['openrouter'],
};App keys use hash-based authentication — the raw API key is shown once at creation and never stored. Keys can be rotated and revoked from the dashboard without downtime.
Budget enforcement
Set a hard budget and Freeloader enforces it. No surprises.
{
"budget": {
"hard": { "daily": 0.00, "monthly": 0.00 },
"soft": { "daily": 0.00, "monthly": 0.00 }
}
}Yes, you can set your hard budget to $0.00 and Freeloader will only use free tiers. Soft limits trigger alerts. Hard limits kill the request. Per-app budgets let you give different consumers different limits.
Configuration
Everything lives in ~/.xswarm/config.json:
{
"version": "2.0",
"routing": {
"strategy": "balanced",
"weights": { "cost": 0.4, "speed": 0.4, "quality": 0.2 }
},
"budget": {
"hard": { "daily": 10.00, "monthly": 200.00 },
"soft": { "daily": 5.00, "monthly": 100.00 }
},
"server": {
"routerPort": 4011,
"port": 4011
},
"email": {
"enabled": false,
"provider": "resend",
"apiKey": "re_...",
"to": "[email protected]",
"digestFrequency": "daily"
}
}Or just use the dashboard. Most people never touch this file. Every config change is versioned and can be rolled back from the dashboard.
CLI
npx xswarm-freeloader # Install / ensure running
npx xswarm-freeloader --status # Provider health, spend, stats
npx xswarm-freeloader --restart # Restart router and dashboard
npx xswarm-freeloader --remove # Stop everything, optionally delete dataAPI reference
Standard OpenAI-compatible endpoints:
| Method | Path | Description |
|--------|------|-------------|
| POST | /v1/chat/completions | Chat completions (streaming supported) |
| POST | /v1/embeddings | Text embeddings |
| GET | /v1/models | List all available models |
| GET | /v1/health | Health check |
| * | /api/* | Dashboard data API (JWT auth) |
Query parameters for routing introspection:
?debug=routing— returns candidate list, scores, policy info in the response?app_id=my-app— identify the calling app (alternative to API key auth)
Provider adapters
8 native JavaScript adapters. No Python. No LiteLLM. No subprocess. Just fetch().
| Provider | Protocol | Free Tier | |----------|----------|-----------| | OpenAI | Native | No | | Anthropic | Messages API → OpenAI format | No | | Google Gemini | Generative AI API → OpenAI format | Yes | | Groq | OpenAI-compatible | Yes | | Mistral | OpenAI-compatible | Yes | | Together AI | OpenAI-compatible | No | | OpenRouter | OpenAI-compatible | Yes (some models) | | Local (Ollama / LM Studio) | OpenAI-compatible | Always free |
Adding a provider is ~80 lines. If it speaks OpenAI, it's a 20-line subclass.
Architecture
npx xswarm-freeloader
└─ setup.js → creates ~/.xswarm/, starts pm2 processes
├─ xswarm-router (port 4011) — Fastify gateway
│ ├─ /v1/* — OpenAI-compatible API
│ ├─ /api/* — dashboard data + admin API
│ ├─ scorer → cost/speed/quality ranking
│ ├─ quality gates → trust tier & capability filtering
│ ├─ sanitizer → PII redaction & secret detection
│ ├─ fallback engine → provider rotation with circuit breakers
│ ├─ degradation scorer → provider reliability tracking
│ ├─ budget enforcer → per-app hard/soft limits
│ ├─ config manager → versioned config with rollback
│ └─ report generator → multi-range PDF & email digests
└─ dashboard (served at /) — Svelte 5 SPA
└─ 8 views, 7 components, dark paper themeStack: Fastify, better-sqlite3, Svelte 5, Tailwind CSS, Vite, pdfkit, nodemailer, pm2. No Python, no Docker, no YAML. Just Node.
Development
git clone https://github.com/chadananda/xswarm-freeloader.git
cd xswarm-freeloader
npm install
npm test # 481 tests
npm run dev # Watch mode (router)
npm run dashboard:dev # Dashboard dev serversrc/
bin/ CLI entry point
install/ Setup, remove, status, client detection
db/ SQLite schema, 8 migrations, 9 repositories
config/ Loader, defaults, manager, Zod schemas
router/ Fastify server, auth, scorer, sanitizer, fallback, quality gates
providers/ 8 native adapters, registry, health monitor, degradation scorer
budget/ Tracker and enforcer
email/ Multi-range digest, PDF reports, alerts, Resend/SMTP mailer
dashboard/ Svelte 5 SPA — 8 views, 7 components
utils/ Crypto (AES-256, bcrypt, HMAC), logger, error classes
scripts/
seed-mock-data.js 90-day mock data seeder (~74K rows)
send-mock-reports.js Seed + generate + email test
tests/ 481 tests (unit, integration, BDD, load)
catalog/ Cloudflare Worker (model catalog API)
website/ Astro 5 marketing siteWhy not just use LiteLLM / OpenRouter / etc?
| | Freeloader | LiteLLM | OpenRouter |
|--|-----------|---------|------------|
| Install | npx xswarm-freeloader | Python + Docker + config | Sign up + API key |
| Runtime | Node.js only | Python subprocess | Cloud service |
| Free tier optimization | Built-in, automatic | Manual config | No |
| Budget enforcement | Per-app hard limits | Basic | No |
| Trust tiers | open / standard / private | No | No |
| Request sanitization | Built-in PII/secret detection | No | No |
| Per-app policies | Keys, budgets, provider rules | No | No |
| Health monitoring | Circuit breakers + degradation | Basic | Managed |
| Reporting | Multi-range PDF + email | No | Basic |
| Self-hosted | Always | Yes | No |
| Data privacy | Your machine only | Your machine | Their servers |
| Price | $0 forever | $0 (OSS) | Usage-based |
Freeloader is purpose-built for one thing: getting your AI bill to $0. Everything else is a side effect of doing that well.
Privacy statement
Freeloader is local-only software. It runs entirely on your machine.
- No telemetry. We do not collect usage data, analytics, error reports, or crash dumps.
- No phone home. The only network request to our infrastructure is an anonymous catalog fetch (a static JSON file of available models). No user data is sent.
- No account required. There is no sign-up, no login, no email collection.
- No data monetization. We do not sell, share, or analyze your data because we do not have your data.
- Your API keys are AES-256-CBC encrypted in a local SQLite database. They authenticate with providers you chose, and nowhere else.
- Your prompts pass directly from your app to the AI provider through Freeloader's local proxy. They are never logged, stored, or transmitted to any Freeloader service.
- Your usage metrics exist only in
~/.xswarm/freeloader.dbon your machine. Reports are generated locally. Email delivery is opt-in.
Full privacy policy: freeloader.xswarm.ai/privacy
Part of the xswarm ecosystem
Freeloader is the open-source foundation of xswarm — a suite of tools for developers building with AI agents.
| Tool | What it does | |------|-------------| | Freeloader | Free tier router (you are here) | | xswarm-ai-sanitize | Detect and redact secrets before they reach your agents | | xswarm-buzz | Promotion orchestration — give your AI agent a marketing brain |
License
MIT. Use it, fork it, ship it. We don't care — your AI provider cares enough for all of us.
