Muxa — Universal LLM Proxy

Muxa is a self-hosted proxy that presents Anthropic- and OpenAI-compatible APIs so IDE/CLI tooling (Claude Code, Cursor, Codex, Continue, Copilot, etc.) can run against your choice of provider—cloud or local—without touching client settings.
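
For example, once the proxy is running (see Quick Start below), any OpenAI-compatible client can point at it directly. A minimal curl sketch, assuming the standard OpenAI chat-completions path and a placeholder model name (substitute whatever you configure in .env):

curl http://localhost:8081/v1/chat/completions \
  -H "Authorization: Bearer sk-muxa" \
  -H "Content-Type: application/json" \
  -d '{"model": "your-model", "messages": [{"role": "user", "content": "Hello"}]}'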

Why Muxa?

  • One URL for many providers – Point every client at http://localhost:8081, then change providers (OpenRouter, Azure, Databricks, Ollama, etc.) centrally via .env.
  • Auto routing & fallback – Send simple prompts to a local Ollama model, heavy workloads to a cloud model, and fall back automatically when a provider fails.
  • Token optimization – Prompt cache, semantic cache, memory injection, and headroom compression operate server-side so all clients enjoy reduced latency/cost.
  • Observability + policy controls – Built-in load shedding, circuit breaker, structured logs, and Prometheus endpoints give production visibility; policy guards enforce workspace/host/git/test rules.
  • Advanced providers – Some tools only speak the OpenAI/Anthropic APIs; Muxa translates that traffic for providers they don’t natively support (OpenRouter, Ollama, MLX).
  • Easy rollouts – Update .env once; every IDE routed through Muxa immediately uses the new provider/policy set.

Quick Start

npm (Recommended)

# Install globally
npm install -g @thelogicatelier/muxa

# Create a .env with your provider key (see docs for all options)
muxa                  # proxy listens on http://localhost:8081
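
For reference, a minimal .env sketch (these two variables match the Docker example below; see the docs for the full option list):

MUXA_PRIMARY_PROVIDER=openai
OPENAI_API_KEY=sk-your-key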

Or run instantly without installing:

npx @thelogicatelier/muxa

From Source

git clone https://github.com/achatt89/muxa.git
cd muxa
npm install
cp .env.example .env  # fill in OPENAI_API_KEY, OPENROUTER_API_KEY, etc.
npm start             # proxy listens on http://localhost:8081

Docker

The example below uses OpenAI, but you can substitute any supported provider (Anthropic, OpenRouter, Ollama, Databricks, Azure, etc.):

docker build -t muxa .
docker run --rm -p 8081:8081 \
  -e MUXA_PRIMARY_PROVIDER=openai \
  -e OPENAI_API_KEY=sk-your-key \
  muxa:latest

Homebrew (macOS)

brew tap thelogicatelier/muxa https://github.com/achatt89/muxa.git
brew install muxa
muxa --help

Multi-Provider Routing Modes

| Mode | Description |
|------|-------------|
| single | All requests go to MUXA_PRIMARY_PROVIDER. Use this when you only want one provider. |
| hybrid | Muxa evaluates request “complexity” (prompt length, tool use, etc.) and routes high-cost calls to MUXA_FALLBACK_PROVIDER. If the primary fails, the fallback is also used. |

Example .env:

MUXA_ROUTING_STRATEGY=hybrid
MUXA_PRIMARY_PROVIDER=openrouter
MUXA_FALLBACK_PROVIDER=anthropic
OPENROUTER_API_KEY=sk-or-...
ANTHROPIC_API_KEY=sk-ant-...

Use single mode if you don’t need fallback.
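
To confirm where requests actually land, you can query the routing diagnostics endpoint listed under Observability & Diagnostics:

curl http://localhost:8081/routing/stats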

Token Optimization & Memory

Enable caches and memory to cut token usage:

MUXA_PROMPT_CACHE_ENABLED=true
MUXA_PROMPT_CACHE_TTL_MS=120000
MUXA_SEMANTIC_CACHE_ENABLED=true
MUXA_SEMANTIC_CACHE_THRESHOLD=0.9
MUXA_MEMORY_ENABLED=true
MUXA_MEMORY_TOPK=3
MUXA_HEADROOM_ENABLED=true
MUXA_HEADROOM_MODE=optimize

  • Prompt cache instantly returns repeated prompts.
  • Semantic cache reuses answers for similar prompts (requires embeddings provider—see docs/embeddings.md).
  • Memory store injects top-K memories into each request.
  • Headroom exposes /metrics/compression and /headroom/* to track savings.

Variable descriptions:

| Variable | Purpose |
|----------|---------|
| MUXA_PROMPT_CACHE_ENABLED | Enable exact-match cache; repeated prompts return instantly. |
| MUXA_PROMPT_CACHE_TTL_MS | Time-to-live (milliseconds) for prompt cache entries. |
| MUXA_SEMANTIC_CACHE_ENABLED | Enable semantic (embeddings-based) cache. Set to true once embeddings are configured. |
| MUXA_SEMANTIC_CACHE_THRESHOLD | Cosine similarity threshold (0-1) for semantic cache hits. |
| MUXA_MEMORY_ENABLED | Enable long-term memory extraction/storage. |
| MUXA_MEMORY_TOPK | Number of memories injected into each request when relevant. |
| MUXA_HEADROOM_ENABLED | Enable headroom sidecar/compression pipeline. |
| MUXA_HEADROOM_MODE | audit (record metrics only) or optimize (mutate/compress history). |
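
To check that the caches are hitting, query the cache metrics endpoints listed under Observability & Diagnostics, for example:

# Semantic cache hit/miss stats
curl http://localhost:8081/metrics/semantic-cache

# Headroom compression savings
curl http://localhost:8081/metrics/compression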

Client Overrides (Cursor, Claude, Codex, Copilot)

  1. Start Muxa (npm or Docker as shown above).
  2. Point clients at Muxa:

| Client | Configuration |
|--------|---------------|
| Claude Code CLI | ANTHROPIC_BASE_URL=http://localhost:8081 ANTHROPIC_API_KEY=sk-muxa claude "Prompt" (or export those vars once before running). |
| Cursor IDE | Settings → Features → Models → Base URL http://localhost:8081/v1, API key sk-muxa, select the model configured in .env. For @Codebase, enable embeddings (see docs/embeddings.md). |
| OpenAI Codex CLI | codex -c model_provider='"muxa"' -c model='"gpt-5.2-codex"' -c 'model_providers.muxa={name="Muxa Proxy",base_url="http://localhost:8081/v1",wire_api="responses",api_key="sk-muxa"}' (no config file change needed). |
| GitHub Copilot | export GITHUB_COPILOT_PROXY_URL=http://localhost:8081/v1, export GITHUB_COPILOT_PROXY_KEY=dummy, then restart the editor (works for VS Code / JetBrains). |
| Cline / Continue / ClawdBot / other OpenAI-compatible tools | Set their custom OpenAI endpoint to http://localhost:8081/v1 with API key sk-muxa and use your desired model name. |

macOS launchctl environment helpers

When you launch IDEs via Spotlight or the Dock, macOS ignores shell exports. Persist API keys for GUI apps (VS Code, Cursor, Claude CLI, etc.) with launchctl:

# Persist environment variables for GUI-launched apps
# No real key is needed; clients just require a dummy value so the OpenAI/Anthropic SDKs don't complain. The actual keys are read from the .env file.
launchctl setenv OPENAI_API_KEY sk-muxa
launchctl setenv ANTHROPIC_API_KEY sk-muxa
launchctl setenv MUXA_BASE_URL http://localhost:8081

# Inspect current values
launchctl getenv OPENAI_API_KEY
launchctl getenv ANTHROPIC_API_KEY

# Remove when rotating credentials
launchctl unsetenv OPENAI_API_KEY
launchctl unsetenv ANTHROPIC_API_KEY

After setting values, quit/reopen the IDE so it inherits the updated environment.

Observability & Diagnostics

  • /dashboard – lightweight HTML dashboard showing health, metrics, routing, compression, and headroom status (auto-refreshing)
  • /health, /health/live, /health/ready – readiness probes
  • /routing/stats, /debug/session, /v1/agents/* – routing + agent diagnostics
  • /metrics, /metrics/prometheus, /metrics/compression, /metrics/semantic-cache – Prometheus-ready metrics, headroom/semantic cache stats
  • /health/headroom, /headroom/status, /headroom/restart, /headroom/logs – headroom lifecycle
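
Since metrics are exposed in Prometheus format, a standard scrape job can collect them. A minimal prometheus.yml sketch (job name and interval are illustrative):

scrape_configs:
  - job_name: muxa                      # illustrative job name
    metrics_path: /metrics/prometheus   # endpoint listed above
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:8081"]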

Cost Optimization & Multi-Model Strategy

Muxa’s proxy lives between every IDE and the upstream provider, so optimization happens once and benefits everything:

  1. Caching chain – Prompt cache handles exact repeat prompts instantly; semantic cache (with embeddings) reuses near-duplicates; the memory store injects curated snippets for long-running projects.
  2. Headroom compression – Enable MUXA_HEADROOM_ENABLED=true to shrink chat histories and reduce token spend while keeping context intact—inspect savings via /metrics/compression.
  3. Hybrid routing – Set MUXA_ROUTING_STRATEGY=hybrid with a fallback provider to auto-route tool-heavy or long prompts to a premium model while keeping cheap models for short requests. Model aliases/fallbacks map IDE-only IDs (e.g., gpt-5.3-codex) to real upstream SKUs without touching editor config.
  4. Instrumentation loop – /api/tokens/stats, /metrics/semantic-cache, /routing/stats, and the dashboard show exactly when caches hit or fallback routing triggers, so you can prove savings to the team.

Best results happen when you warm caches (replay tests or common prompts), keep session_id/user identifiers consistent, and leave the proxy running for a day or two so semantic cache + headroom gather enough data. See docs/cost-optimization.md for the full playbook, configuration examples, timelines, and troubleshooting tips.
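
As a quick warm-up, replaying an identical request should make the second call a prompt-cache hit, assuming MUXA_PROMPT_CACHE_ENABLED=true and the standard OpenAI chat-completions path (the model name is a placeholder):

# Run the same request twice; the second should return from the prompt cache
for i in 1 2; do
  curl -s http://localhost:8081/v1/chat/completions \
    -H "Authorization: Bearer sk-muxa" \
    -H "Content-Type: application/json" \
    -d '{"model": "your-model", "messages": [{"role": "user", "content": "Summarize the repo layout"}]}'
done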

For a deep dive into how hybrid routing scores requests and auto-switches between providers, see docs/routing.md.

Embeddings & @Codebase Support

See docs/embeddings.md for Ollama, llama.cpp, OpenRouter, and OpenAI embeddings configuration. Example (Ollama):

ollama pull nomic-embed-text
export MUXA_SEMANTIC_CACHE_ENABLED=true
export OLLAMA_BASE_URL=http://localhost:11434
export OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
npm start

Testing

npm test                    # 90+ suites covering API/provider/integration/perf
node scripts/endpoint-parity-preflight.js

Docker Compose Example

See docker-compose.example.yml for a sample proxy + Ollama stack.
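
A rough sketch of what such a stack might look like, assuming the environment variables shown earlier (the example file in the repo is authoritative; the provider id "ollama" is an assumption):

services:
  muxa:
    build: .
    ports:
      - "8081:8081"
    environment:
      - MUXA_PRIMARY_PROVIDER=ollama        # assumed provider id; check the example file
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"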

Additional Documentation

Detailed GitBook-style docs live under docs/.

Built by The Logic Atelier