loclaude

Read the docs

Claude Code with Local LLMs

Stop burning through Claude API usage limits. Run Claude Code's powerful agentic workflow with local Ollama models on your own hardware.

Requires Ollama v0.14.2 or higher.

Zero API costs. No rate limits. Complete privacy.


Quick Start • Why loclaude? • Installation • FAQ


Why loclaude?

Real Value

  • No Rate Limits: Use Claude Code as much as you want
  • Privacy: Your code never leaves your machine
  • Cost Control: Use your own hardware, pay for electricity not tokens
  • Offline Capable: Work without internet (after model download)
  • GPU or CPU: Works with NVIDIA GPUs or CPU-only systems

What to Expect

loclaude provides:

  • One-command setup for Ollama + Open WebUI containers
  • Smart model management with auto-loading
  • GPU auto-detection with CPU fallback
  • Project scaffolding with Docker configs

Installation

# With npm (requires Node.js 18+)
npm install -g loclaude

# With bun (faster, recommended)
bun install -g loclaude # use bun-loclaude for commands

vs. Other Solutions

| Solution | Cost | Speed | Privacy | Limits |
|----------|------|-------|---------|--------|
| loclaude | Free after setup | Fast (GPU) | 100% local | None |
| Claude API/Web | $20-200+/month | Fast | Cloud-based | Rate limited |
| GitHub Copilot | $10-20/month | Fast | Cloud-based | Context limited |
| Cursor/Codeium | $20+/month | Fast | Cloud-based | Usage limits |

loclaude gives you the power of Ollama with the convenience of a managed setup for Claude Code integration.

Quick Start (5 Minutes)

# 1. Install loclaude
npm install -g loclaude

# 2. Install Claude Code (if you haven't already)
npm install -g @anthropic-ai/claude-code

# 3. Setup your project (auto-detects GPU)
loclaude init

# 4. Start Ollama container
loclaude docker-up

# 5. Pull a model (choose based on your hardware)
loclaude models-pull qwen3-coder:30b    # GPU with 16GB+ VRAM
# OR
loclaude models-pull qwen2.5-coder:7b   # CPU or limited VRAM

# 6. Run Claude Code with unlimited local LLM
loclaude run

That's it! You now have unlimited Claude Code sessions with local models.

Prerequisites

Required:

  • Docker and Docker Compose
  • Node.js 18+ (or Bun)
  • Claude Code CLI (@anthropic-ai/claude-code)

Optional (for GPU acceleration):

  • NVIDIA GPU with recent drivers
  • NVIDIA Container Toolkit

CPU-only systems work fine! Pass the --no-gpu flag during init and stick to smaller models, as sketched below.
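
For instance, a minimal CPU-only workflow built from the commands documented in this README (the model choice is just a suggestion from the recommended-models table):

# Initialize without GPU config, start the container, and pull a small model
loclaude init --no-gpu
loclaude docker-up
loclaude models-pull qwen2.5-coder:7b
loclaude run -m qwen2.5-coder:7b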

Check your setup:

loclaude doctor

Features

Automatic Model Loading

When you run loclaude run, it automatically:

  1. Checks if your selected model is loaded in Ollama
  2. If not loaded, warms up the model with a 10-minute keep-alive (configurable via environment variables; see the sketch after this list)
  3. Shows [loaded] indicator in model selection for running models
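
For reference, the warm-up boils down to a request against the Ollama HTTP API with a keep_alive value. A rough equivalent you can run by hand (the exact request loclaude sends may differ) is:

# Load the model and keep it resident for 10 minutes; with only "model" and
# "keep_alive" in the body, Ollama loads the model without generating anything
curl http://localhost:11434/api/generate -d '{"model": "qwen3-coder:30b", "keep_alive": "10m"}'

# List the models currently loaded in memory
curl http://localhost:11434/api/ps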

GPU Auto-Detection

loclaude init automatically detects NVIDIA GPUs and configures the appropriate Docker setup:

  • GPU detected: Uses runtime: nvidia and CUDA-enabled images
  • No GPU: Uses CPU-only configuration with smaller default models
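
To confirm the container actually sees the GPU once it is up, a quick check (this assumes the compose service is named ollama; adjust to the service name in your generated docker-compose.yml):

# Should print the same GPU table as running nvidia-smi on the host
docker compose exec ollama nvidia-smi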

Commands

Running Claude Code

loclaude run                    # Interactive model selection
loclaude run -m qwen3-coder:30b # Use specific model
loclaude run -- --help          # Pass args to claude

Project Setup

loclaude init                   # Auto-detect GPU, scaffold project
loclaude init --gpu             # Force GPU mode
loclaude init --no-gpu          # Force CPU-only mode
loclaude init --force           # Overwrite existing files
loclaude init --no-webui        # Skip Open WebUI in compose file

Docker Management

loclaude docker-up              # Start containers (detached)
loclaude docker-up --no-detach  # Start in foreground
loclaude docker-down            # Stop containers
loclaude docker-status          # Show container status
loclaude docker-logs            # Show logs
loclaude docker-logs --follow   # Follow logs
loclaude docker-restart         # Restart containers

Model Management

loclaude models                 # List installed models
loclaude models-pull <name>     # Pull a model
loclaude models-rm <name>       # Remove a model
loclaude models-show <name>     # Show model details
loclaude models-run <name>      # Run model interactively (ollama CLI)

Diagnostics

loclaude doctor                 # Check prerequisites
loclaude config                 # Show current configuration
loclaude config-paths           # Show config file search paths

Recommended Models

For GPU (16GB+ VRAM) - Best Experience

| Model | Size | Speed | Quality | Best For |
|-------|------|-------|---------|----------|
| qwen3-coder:30b | ~17 GB | ~50-100 tok/s | Excellent | Most coding tasks, refactoring, debugging |
| deepseek-coder:33b | ~18 GB | ~40-80 tok/s | Excellent | Code understanding, complex logic |

Recommendation: Start with qwen3-coder:30b for the best balance of speed and quality.

For CPU or Limited VRAM (<16GB) - Still Productive

| Model | Size | Speed | Quality | Best For |
|-------|------|-------|---------|----------|
| qwen2.5-coder:7b | ~4 GB | ~10-20 tok/s | Good | Code completion, simple refactoring |
| deepseek-coder:6.7b | ~4 GB | ~10-20 tok/s | Good | Understanding existing code |
| llama3.2:3b | ~2 GB | ~15-30 tok/s | Fair | Quick edits, file operations |

Configuration

loclaude supports configuration via files and environment variables.

Config Files

Config files are loaded in priority order:

  1. ./.loclaude/config.json (project-local)
  2. ~/.config/loclaude/config.json (user global)

Example config:

{
  "ollama": {
    "url": "http://localhost:11434",
    "defaultModel": "qwen3-coder:30b"
  },
  "docker": {
    "composeFile": "./docker-compose.yml",
    "gpu": true
  },
  "claude": {
    "extraArgs": ["--verbose"]
  }
}

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| OLLAMA_URL | Ollama API endpoint | http://localhost:11434 |
| OLLAMA_MODEL | Default model name | qwen3-coder:30b |
| LOCLAUDE_COMPOSE_FILE | Path to docker-compose.yml | ./docker-compose.yml |
| LOCLAUDE_GPU | Enable GPU (true/false) | true |
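
For example, environment variables make it easy to point a single session at a different Ollama instance without touching any config file (the host address below is a placeholder):

# One-off override: remote Ollama host and a smaller default model
OLLAMA_URL=http://192.168.1.50:11434 OLLAMA_MODEL=qwen2.5-coder:7b loclaude run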

Priority

Configuration is merged in this order (highest priority first):

  1. CLI arguments
  2. Environment variables
  3. Project config (./.loclaude/config.json)
  4. User config (~/.config/loclaude/config.json)
  5. Default values
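
For instance, because CLI arguments sit at the top of the list, the -m flag wins even when an environment variable and a config file both name a different model:

# OLLAMA_MODEL and config.json are ignored for this run; -m takes precedence
OLLAMA_MODEL=qwen3-coder:30b loclaude run -m qwen2.5-coder:7b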

Service URLs

When containers are running:

| Service | URL | Description |
|---------|-----|-------------|
| Ollama API | http://localhost:11434 | LLM inference API |
| Open WebUI | http://localhost:3000 | Chat interface |
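
Once loclaude docker-up reports the containers as running, both services can be sanity-checked from the shell (endpoints taken from the table above; /api/version is a standard Ollama route):

# Ollama API: should return a JSON version string
curl http://localhost:11434/api/version

# Open WebUI: should return an HTTP 200 status line
curl -I http://localhost:3000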

Project Structure

After running loclaude init:

.
├── .claude/
│   └── CLAUDE.md          # Claude Code instructions
├── .loclaude/
│   └── config.json        # Loclaude configuration
├── models/                # Ollama model storage (gitignored)
├── docker-compose.yml     # Container definitions (GPU or CPU mode)
├── mise.toml              # Task runner configuration
└── README.md

Using with mise

The init command creates a mise.toml with convenient task aliases:

mise run up              # loclaude docker-up
mise run down            # loclaude docker-down
mise run claude          # loclaude run
mise run pull <model>    # loclaude models-pull <model>
mise run doctor          # loclaude doctor

FAQ

Is this really unlimited?

Yes! Once you have models downloaded, you can run as many sessions as you want with zero additional cost.

How does the quality compare to Claude API?

30B-parameter models like qwen3-coder:30b are roughly comparable to GPT-3.5 and handle most coding tasks reasonably well. Larger models do somewhat better. The Claude API still produces higher-quality results, but loclaude lets you keep working after you hit that pesky usage limit.

Do I need a GPU?

No, but a GPU is highly recommended. CPU-only mode works with smaller models at ~10-20 tokens/sec; a GPU with 16GB+ VRAM gives you 50-100 tokens/sec with larger, better models.

Can I use this with the Claude API too?

Absolutely! Keep using Claude API for critical tasks, use loclaude for everything else to save money and avoid limits.

Troubleshooting

Check System Requirements

loclaude doctor

This verifies:

  • Docker and Docker Compose installation
  • NVIDIA GPU detection (optional)
  • NVIDIA Container Toolkit (optional)
  • Claude Code CLI
  • Ollama API connectivity

Container Issues

# View logs
loclaude docker-logs --follow

# Restart containers
loclaude docker-restart

# Full reset
loclaude docker-down && loclaude docker-up

Connection Issues

If Claude Code can't connect to Ollama:

  1. Verify Ollama is running: loclaude docker-status
  2. Check the API: curl http://localhost:11434/api/tags
  3. Verify your config: loclaude config

GPU Not Detected

If you have a GPU but it's not detected:

  1. Check NVIDIA drivers: nvidia-smi
  2. Test Docker GPU access: docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi
  3. Install NVIDIA Container Toolkit if missing (a typical install is sketched after this list)
  4. Re-run loclaude init --gpu to force GPU mode
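
For step 3, a typical install on Debian/Ubuntu looks roughly like this (package names and repository setup vary by distro; check NVIDIA's Container Toolkit docs for yours):

# Install the toolkit, register it with Docker, then restart the daemon
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker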

Running on CPU

If inference is slow on CPU:

  1. Use smaller, quantized models: qwen2.5-coder:7b, llama3.2:3b
  2. Expect ~10-20 tokens/sec on modern CPUs
  3. Consider cloud models via Ollama: glm-4.7:cloud

Getting Help

  • Issues/Bugs: GitHub Issues
  • Questions: GitHub Discussions
  • Documentation: Run loclaude --help or check this README
  • System Check: Run loclaude doctor to diagnose problems

Development

Building from Source

git clone https://github.com/nicholasgalante1997/loclaude.git loclaude
cd loclaude
bun install
bun run build

Running Locally

# With bun (direct)
bun bin/index.ts --help

# With node (built)
node bin/index.mjs --help

Testing

# Test both runtimes
bun bin/index.ts doctor
node bin/index.mjs doctor

License

MIT