waifu-code

v2.0.1

Free coding assistant CLI — Claude Code powered by NVIDIA NIM

waifu CLI

A streamlined proxy and wrapper for Claude Code that transparently routes API requests through your choice of AI provider, eliminating the need for complex Python proxy setups.

By default, it uses NVIDIA NIM with the moonshotai/kimi-k2-thinking model and natively supports the Anthropic streaming API.

Installation

First, ensure you have the official Claude Code CLI installed:

npm install -g @anthropic-ai/claude-code

Note: Claude Code has moved to a native installer in recent versions. If you see a prompt to update, follow it — it won't break waifu.

Then, install the waifu-code CLI globally:

npm install -g waifu-code

Providers

waifu now supports multiple AI providers. The original NVIDIA NIM is the default, but you can switch with a single flag:

| Provider | Free? | Works with Claude Code? | Notes |
|---|---|---|---|
| NVIDIA NIM (default) | Free tier | ✓ Yes | Best free option, high token limits |
| OpenRouter | Free models available | ✓ Yes | Recommended; no token size limits |
| Ollama | Completely free | ⚠ Depends | Needs 32b+ for reliable tool use |

Usage

Simply run:

waifu

On first launch, it will prompt you for your API key. This key is saved automatically to ~/.waifu/config.json.

waifu then starts the integrated TypeScript proxy in the background on a random available port and launches your locally installed Claude Code CLI through it. No manual configuration or $env modifications are required.
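The startup sequence described above can be sketched roughly as follows. This is a minimal illustration, not waifu's actual source: getFreePort and buildChildEnv are hypothetical names, and pointing the child process at the proxy through the ANTHROPIC_BASE_URL environment variable is an assumption about how the wiring might be done. Binding to port 0 asks the operating system to pick any free port.

```typescript
import { createServer } from "node:net";

/** Ask the OS for a free TCP port by binding to port 0, then release it. */
function getFreePort(host = "127.0.0.1"): Promise<number> {
  return new Promise((resolve, reject) => {
    const server = createServer();
    server.once("error", reject);
    server.listen(0, host, () => {
      const address = server.address();
      if (address === null || typeof address === "string") {
        // Not expected for a TCP listener, but handled for type safety.
        server.close(() => reject(new Error("unexpected address type")));
        return;
      }
      const { port } = address;
      // Release the port so the real proxy can bind to it afterwards.
      server.close(() => resolve(port));
    });
  });
}

/**
 * Hypothetical helper: build the environment for the child CLI so that it
 * talks to the local proxy. Only the child's environment is changed.
 */
function buildChildEnv(
  port: number,
  base: Record<string, string | undefined> = {},
): Record<string, string | undefined> {
  return { ...base, ANTHROPIC_BASE_URL: `http://127.0.0.1:${port}` };
}
```

A launcher could pass the result of buildChildEnv when spawning claude, which would leave the user's own shell environment untouched. Note that a port found this way can in principle be taken by another process before the proxy binds to it, so a real implementation might simply let the proxy itself listen on port 0.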

Using a different provider

# NVIDIA NIM (original default)
waifu --provider nim --key nvapi-xxx

# OpenRouter — recommended free option
waifu --provider openrouter --key sk-or-xxx --model openrouter/free

# OpenRouter with a specific model
waifu --provider openrouter --key sk-or-xxx --model nvidia/nemotron-3-super-120b-a12b:free

# Ollama — fully local, no key needed
waifu --provider ollama --model qwen2.5:32b

Options

Usage: waifu [options]

Run the Claude Code CLI through your chosen AI provider proxy.

Options:
  -v, --version              Output the current version
  --provider <n>             AI provider: nim, openrouter, ollama (default: nim)
  --key <key>                API key for the chosen provider (saved automatically)
  --nim-key <key>            NVIDIA NIM API key (shorthand)
  --openrouter-key <key>     OpenRouter API key (shorthand)
  --model <model>            Model to use (overrides per-provider default)
  --port <port>              Port to run the proxy on (default: auto)
  --proxy-only               Start only the proxy server without launching claude
  --no-waifu                 Disable the waifu overlay
  --verbose                  Enable verbose logging for debugging
  -h, --help                 Display help for command

Commands:
  model                      Interactively select a new model for the current provider
  config                     View or update saved configuration
  providers                  List all supported providers and default models

Saving your config

# Save provider + model so you never have to type flags again
waifu config --provider openrouter --model openrouter/free

# View current saved config
waifu config

# Reset everything
waifu config --reset
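One plausible way to combine saved configuration with command-line flags is sketched below. The precedence shown (flags first, then saved values, then defaults) and the names resolveSettings, Settings, and Resolved are assumptions for illustration; this README only states that NIM is the default provider and that moonshotai/kimi-k2-thinking is its default model, so other provider defaults are left undefined here.

```typescript
type Provider = "nim" | "openrouter" | "ollama";

interface Settings {
  provider?: Provider;
  model?: string;
}

interface Resolved {
  provider: Provider;
  model: string | undefined; // undefined means "use the provider's own default"
}

const DEFAULT_PROVIDER: Provider = "nim";

// Only the NIM default model is stated in this README.
const KNOWN_DEFAULT_MODELS: Partial<Record<Provider, string>> = {
  nim: "moonshotai/kimi-k2-thinking",
};

/** Assumed precedence: command-line flags, then saved config, then defaults. */
function resolveSettings(flags: Settings, saved: Settings): Resolved {
  const provider = flags.provider ?? saved.provider ?? DEFAULT_PROVIDER;
  // A saved model is only reused when it was saved for the provider in use.
  const savedProvider = saved.provider ?? DEFAULT_PROVIDER;
  const savedModel = provider === savedProvider ? saved.model : undefined;
  const model = flags.model ?? savedModel ?? KNOWN_DEFAULT_MODELS[provider];
  return { provider, model };
}
```

Under these assumptions, running waifu with no flags after saving openrouter and openrouter/free would resolve to that pair, while an explicit --provider ollama would ignore the saved OpenRouter model.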

Provider notes

OpenRouter

The most reliable free option. openrouter/free automatically picks from all currently available free models:

waifu --provider openrouter --key sk-or-xxx --model openrouter/free

Free model names change over time — if you get a 404 on a specific model name, switch to openrouter/free or check openrouter.ai/models for models marked :free.

Recommended free models that work well with Claude Code:

  • nvidia/nemotron-3-super-120b-a12b:free — large, reads files autonomously, good tool use
  • deepseek/deepseek-r1:free — strong reasoning
  • openrouter/free — auto-selects, always available

Ollama (local)

Ollama runs models entirely on your machine — no internet, no API key, no cost.

Setup:

  1. Download from ollama.com
  2. Pull a model: ollama pull qwen2.5:32b
  3. On Linux, start the server first: ollama serve

Model guide by RAM:

| RAM | Recommended | Command |
|---|---|---|
| 16GB | qwen2.5:14b or mistral-nemo | ollama pull qwen2.5:14b |
| 32GB | qwen2.5:32b | ollama pull qwen2.5:32b |
| 64GB+ | qwen2.5:72b | ollama pull qwen2.5:72b |
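The guide above can be expressed as a small lookup. This is only a sketch: recommendOllamaModel is a hypothetical helper, and treating each row as a minimum threshold (so that a 48GB machine gets the 32GB recommendation) is an assumption, since the table lists only three sizes.

```typescript
/**
 * Map installed RAM (in GB) to the model recommended in the guide.
 * Returns undefined below 16GB because the guide lists nothing for that range.
 */
function recommendOllamaModel(ramGb: number): string | undefined {
  if (ramGb >= 64) return "qwen2.5:72b";
  if (ramGb >= 32) return "qwen2.5:32b";
  if (ramGb >= 16) return "qwen2.5:14b";
  return undefined;
}
```

In Node.js the input could come from os.totalmem() divided by 1024 ** 3, keeping in mind that the reported total is usually slightly below the nominal size of the installed memory.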

Known limitations with small models (below 32b):

  • Models may hallucinate tool names not in Claude Code's schema (e.g. Glob, simplify, GloballySearch) — these silently do nothing
  • Models tend to ask clarifying questions instead of reading files autonomously
  • Tool call formatting is inconsistent

waifu handles one common Ollama issue automatically: some models output tool calls as raw JSON text instead of the structured API format. The proxy detects and converts both formats:

  • Bullet+XML: ● <function=Name><parameter=key>value</parameter>
  • Plain JSON: { "name": "ToolName", "arguments": { ... } }
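A minimal sketch of how both formats might be detected and converted is shown below. It is not the proxy's actual code: the function names and regular expressions are assumptions, and the output shape follows the Anthropic tool_use content block (type, id, name, input).

```typescript
interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface RawCall {
  name: string;
  input: Record<string, unknown>;
}

/** Parse text such as: ● <function=Name><parameter=key>value</parameter> */
function parseXmlStyle(text: string): RawCall | null {
  const fn = /<function=([^>\s]+)>/.exec(text);
  if (fn === null) return null;
  const [, name] = fn;
  if (name === undefined) return null;
  const input: Record<string, unknown> = {};
  const paramRe = /<parameter=([^>\s]+)>([\s\S]*?)<\/parameter>/g;
  let m: RegExpExecArray | null;
  while ((m = paramRe.exec(text)) !== null) {
    const [, key, value] = m;
    if (key !== undefined && value !== undefined) input[key] = value.trim();
  }
  return { name, input };
}

/** Parse text such as: { "name": "ToolName", "arguments": { ... } } */
function parseJsonStyle(text: string): RawCall | null {
  const trimmed = text.trim();
  if (!trimmed.startsWith("{") || !trimmed.endsWith("}")) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return null; // not valid JSON, so treat it as ordinary text
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const candidate = parsed as { name?: unknown; arguments?: unknown };
  if (typeof candidate.name !== "string") return null;
  const args = candidate.arguments;
  const input =
    typeof args === "object" && args !== null && !Array.isArray(args)
      ? (args as Record<string, unknown>)
      : {};
  return { name: candidate.name, input };
}

/** Convert raw model text into a structured tool_use block, or null if none is found. */
function toToolUseBlock(text: string, id: string): ToolUseBlock | null {
  const call = parseXmlStyle(text) ?? parseJsonStyle(text);
  return call === null ? null : { type: "tool_use", id, name: call.name, input: call.input };
}
```

Text that matches neither format returns null, so it can be passed through unchanged as a normal text block. A real implementation would probably also check the parsed name against the tools declared in the request, given the hallucinated tool names mentioned above.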

For reliable agentic use locally, 32b+ models are recommended.

How It Works

This tool is a drop-in replacement for the original Python proxy server. It runs as a native Node.js proxy that does the following:

  1. Intercepts Anthropic Server-Sent Events (SSE).
  2. Converts Claude's messages format to OpenAI-compatible JSON (which all supported providers speak).
  3. Fixes Anthropic API quirks (like ?beta=true queries and streaming header strictness) under the hood.
  4. Auto-detects and preserves <think> tags returned by supported models without breaking the CLI UI experience.
  5. Short-circuits trivial requests (quota checks, title generation, suggestion mode) locally without hitting any API.
  6. Detects raw-text tool calls from local models and converts them to proper tool use blocks.
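Step 2 can be illustrated with a simplified request translation. The sketch below is an assumption about the general shape of such a conversion, not the proxy's real code: it handles only text content, moves the Anthropic system prompt into a leading system message, and ignores tool definitions, tool results, and images. The type and function names are hypothetical.

```typescript
type AnthropicBlock = { type: string; text?: string };

interface AnthropicMessage {
  role: "user" | "assistant";
  content: string | AnthropicBlock[];
}

interface AnthropicRequest {
  model: string;
  max_tokens: number;
  system?: string | AnthropicBlock[];
  messages: AnthropicMessage[];
  stream?: boolean;
}

interface OpenAIMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface OpenAIRequest {
  model: string;
  max_tokens: number;
  messages: OpenAIMessage[];
  stream: boolean;
}

/** Join the text blocks of an Anthropic content value into one string. */
function flattenText(content: string | AnthropicBlock[]): string {
  if (typeof content === "string") return content;
  return content
    .filter((b) => b.type === "text" && typeof b.text === "string")
    .map((b) => b.text as string)
    .join("\n");
}

/** Translate an Anthropic Messages request into an OpenAI-style chat request. */
function toOpenAIRequest(req: AnthropicRequest, targetModel: string): OpenAIRequest {
  const messages: OpenAIMessage[] = [];
  if (req.system !== undefined) {
    const system = flattenText(req.system);
    if (system.length > 0) messages.push({ role: "system", content: system });
  }
  for (const m of req.messages) {
    messages.push({ role: m.role, content: flattenText(m.content) });
  }
  // The model requested by the CLI is replaced by the provider's model.
  return { model: targetModel, max_tokens: req.max_tokens, messages, stream: req.stream ?? false };
}
```

The response path would run the same idea in reverse, re-emitting the provider's streamed chunks as Anthropic-style SSE events, which is where steps 1, 4, and 6 would come into play.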