linkllm

v0.0.1

The unified LLM runtime — local inference, API proxy, and monitoring in one blazing-fast tool. A powerful alternative to Ollama + LiteLLM, built in Rust.

# Install and run your first model in under 60 seconds
curl -fsSL https://install.linkllm.dev | sh
linkllm pull mistralai/Mistral-7B-Instruct-v0.3-GGUF
linkllm chat mistral

What is LinkLLM?

LinkLLM is a single tool that replaces both Ollama and LiteLLM, and goes further. It gives you:

  • Local inference of any GGUF model (via llama.cpp or the pure-Rust candle backend)
  • API proxy to OpenAI, Gemini, Anthropic, Groq, and any OpenAI-compatible endpoint
  • Model management — pull any model from HuggingFace with one command
  • Production-ready REST API with OpenAI-compatible routes, auth, rate limiting, TLS
  • Real-time monitoring dashboard right inside your terminal
  • Multi-model routing with fallback chains and cost tracking

All in a single binary. No Docker required. Works on Windows, macOS, Linux, and Termux.


✨ Features

🦀 Rust-Powered Core

Built on Tokio + Axum — async from the ground up. Memory safe, no garbage collector pauses, minimal footprint.

🤖 Local Model Inference

  • Run GGUF models via llama.cpp FFI bindings — same performance, Rust-safe wrapper
  • Pure Rust inference with candle (no C++ dependency)
  • GPU acceleration: CUDA, ROCm, Apple Metal — auto-detected
  • Quantization: Q4_K_M, Q5_K_S, Q8_0, F16 and more

🌐 Universal API Proxy

Route requests to any provider through a single unified API:

| Provider | Models |
|---|---|
| OpenAI | gpt-4o, o1, gpt-4-turbo, ... |
| Google Gemini | gemini-2.0-flash, gemini-1.5-pro, ... |
| Anthropic | claude-3-5-sonnet, claude-3-opus, ... |
| Groq | llama3, mixtral (ultra-fast) |
| Together AI | 50+ open models |
| Any OpenAI-compat | Custom base URL |
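
For example, a hosted model can be addressed through the same local endpoint as a local one. This is a minimal sketch using the openai Python package (shown again later in this README); it assumes the Gemini key has already been registered with linkllm key add gemini and that the proxy accepts provider model names as-is.

from openai import OpenAI

# Point the standard OpenAI client at the local LinkLLM server.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="your-api-key")

# Same call shape as for a local model; the proxy forwards it to Gemini.
response = client.chat.completions.create(
    model="gemini-2.0-flash",
    messages=[{"role": "user", "content": "Summarize the Rust borrow checker in one sentence."}],
)
print(response.choices[0].message.content)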

📦 HuggingFace Model Pull

linkllm pull mistralai/Mistral-7B-Instruct-v0.3-GGUF
linkllm pull TheBloke/Llama-2-13B-chat-GGUF --quant Q4_K_M
linkllm pull google/gemma-2-9b-it

Resume interrupted downloads. SHA-256 integrity check. Auto-conversion to GGUF.

📊 Terminal Monitoring Dashboard

linkllm monitor

Real-time TUI powered by Ink:

  • Tokens/second live graph
  • Latency histograms (p50 / p95 / p99)
  • Active model memory usage
  • Per-provider cost breakdown
  • Request log (live tail)
  • API key usage tracker
  • Error rate + alerts

🔐 Security-First Design

  • AES-256-GCM encrypted API key store (OS keychain integration)
  • TLS 1.3 by default, mTLS for production
  • HMAC request signing in the Rust SDK
  • JWT bearer tokens for server access
  • Per-key rate limits and quotas (see the client-side sketch after this list)
  • Sandboxed model inference
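
From a client's point of view, the bearer-token and rate-limit pieces combine roughly as sketched below. This is an assumption-heavy illustration: it presumes the server answers rate-limited requests with HTTP 429 and an optional Retry-After header, which is standard practice but not documented above.

import time
import requests

# Hypothetical token for server access; how tokens are issued is deployment-specific.
HEADERS = {"Authorization": "Bearer your-jwt-token"}

def chat(payload: dict) -> dict:
    """POST a chat completion, backing off if the per-key rate limit is hit."""
    for attempt in range(3):
        resp = requests.post(
            "http://localhost:11434/v1/chat/completions",
            headers=HEADERS,
            json=payload,
            timeout=60,
        )
        if resp.status_code == 429:  # assumed rate-limit response
            time.sleep(int(resp.headers.get("Retry-After", 2 ** attempt)))
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("rate limited after retries")

print(chat({"model": "mistral", "messages": [{"role": "user", "content": "ping"}]}))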

🔀 Multi-Model Routing

Define routing rules in linkllm.toml:

[routing]
default = "mistral"

[[routing.rules]]
match = "code"
model = "deepseek-coder"

[[routing.rules]]
match = "long-context"
model = "gemini-1.5-pro"
fallback = ["gpt-4o", "claude-3-opus"]
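
How a request is matched to a rule is not spelled out above. The sketch below assumes that a rule's match value can be used as a model alias in a request, and that the server walks the fallback chain transparently when the primary provider fails; treat it as an illustration of the intent rather than the exact API.

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="your-api-key")

# Assumption: the "long-context" rule name is routable like a model name,
# with the fallback chain (gpt-4o, claude-3-opus) applied server-side.
response = client.chat.completions.create(
    model="long-context",
    messages=[{"role": "user", "content": "Summarize this 200-page report: ..."}],
)
print(response.model)  # which model actually served the request
print(response.choices[0].message.content)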

⚡ Quick Start

1. Install

Linux / macOS / Termux:

curl -fsSL https://install.linkllm.dev | sh

Windows (PowerShell):

irm https://install.linkllm.dev/windows | iex

Homebrew:

brew install linkllm/tap/linkllm

npm (CLI only):

npm install -g linkllm

pip (Python SDK + CLI):

pip install linkllm

From source:

git clone https://github.com/linkllm/linkllm
cd linkllm
cargo build --release

2. Pull a Model

# Pull from HuggingFace (GGUF auto-detected)
linkllm pull mistralai/Mistral-7B-Instruct-v0.3-GGUF

# Specify quantization
linkllm pull TheBloke/Llama-2-13B-chat-GGUF --quant Q4_K_M

# List downloaded models
linkllm list

3. Chat in Terminal

linkllm chat mistral
linkllm chat gpt-4o          # routes to OpenAI (needs API key)
linkllm chat gemini-flash    # routes to Google Gemini

4. Start the Server

linkllm serve
# Server running at http://localhost:11434
# OpenAI-compatible API at http://localhost:11434/v1
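
Once the server is up, listing the models it exposes is a quick sanity check. /v1/models is the standard OpenAI-style listing route; whether LinkLLM serves it is an assumption here, but any OpenAI-compatible API is expected to.

import requests

# Assumes the default bind address from `linkllm serve` above.
resp = requests.get(
    "http://localhost:11434/v1/models",
    headers={"Authorization": "Bearer your-api-key"},
    timeout=10,
)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model["id"])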

5. Monitor

linkllm monitor

🔌 API

LinkLLM exposes a fully OpenAI-compatible REST API. Drop it in as a replacement for api.openai.com:

Chat Completions

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{
    "model": "mistral",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

Python — OpenAI SDK Compatible

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="mistral",
    messages=[{"role": "user", "content": "Explain Rust ownership"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Python — LinkLLM Native SDK

pip install linkllm

import linkllm

client = linkllm.Client()

# Chat with any model — local or API
response = client.chat("mistral", "What is the capital of France?")
print(response.text)

# Streaming
for token in client.stream("gpt-4o", "Write a haiku about Rust"):
    print(token, end="", flush=True)

# Pull a model programmatically
client.pull("TheBloke/Mistral-7B-Instruct-v0.2-GGUF")

# List local models
models = client.list()
for m in models:
    print(f"{m.name} — {m.size_gb:.1f} GB")

TypeScript / JavaScript

npm install linkllm

import { LinkLLM } from "linkllm";

const client = new LinkLLM({ baseUrl: "http://localhost:11434" });

// Chat
const response = await client.chat({
  model: "mistral",
  messages: [{ role: "user", content: "Hello from TypeScript!" }],
});
console.log(response.content);

// Streaming
const stream = client.stream({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Tell me a story" }],
});

for await (const token of stream) {
  process.stdout.write(token);
}

Rust SDK

# Cargo.toml
[dependencies]
linkllm = "0.1"
tokio = { version = "1", features = ["full"] }

use linkllm::{Client, ChatMessage, Role};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new("http://localhost:11434")?;

    let response = client
        .chat("mistral")
        .message(Role::User, "What is Rust?")
        .send()
        .await?;

    println!("{}", response.content());
    Ok(())
}

⚙️ Configuration

LinkLLM is configured via ~/.linkllm/config.toml:

[server]
host = "127.0.0.1"
port = 11434
tls = false

[models]
default = "mistral"
model_dir = "~/.linkllm/models"

[inference]
gpu_layers = -1        # -1 = auto (offload all to GPU)
context_size = 4096
threads = 8

[api_keys]
# Encrypted. Use `linkllm key add` to set these safely.
openai = ""
gemini = ""
anthropic = ""
groq = ""

[routing]
default = "mistral"
fallback_chain = ["mistral", "gpt-4o-mini"]

[monitoring]
enabled = true
metrics_port = 9090    # Prometheus-compatible /metrics
log_level = "info"
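
With monitoring enabled, the Prometheus-compatible endpoint can be scraped directly. The metric names are not documented above, so this sketch just dumps whatever the endpoint returns; it assumes the endpoint listens on the configured metrics_port at /metrics.

import requests

# Assumes metrics_port = 9090 from the config above and the usual
# plain-text Prometheus exposition format.
body = requests.get("http://localhost:9090/metrics", timeout=10).text
for line in body.splitlines():
    if line and not line.startswith("#"):  # skip HELP/TYPE comment lines
        print(line)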

Managing API Keys

linkllm key add openai sk-...
linkllm key add gemini AIza...
linkllm key add anthropic sk-ant-...
linkllm key list
linkllm key rm openai

Keys are stored encrypted with AES-256-GCM, tied to your OS keychain.


📋 CLI Reference

linkllm <command> [options]

Commands:
  serve               Start the LinkLLM server
  chat [model]        Start an interactive chat session
  pull <user/model>   Pull a model from HuggingFace
  push <model>        Push a model to the LinkLLM registry
  list                List all local models
  rm <model>          Remove a local model
  show <model>        Show model info and metadata
  monitor             Open the TUI monitoring dashboard
  key <add|rm|list>   Manage encrypted API keys
  config <get|set>    View or update configuration
  run <model>         Pull (if needed) and start chatting

Options:
  --host              Server host (default: 127.0.0.1)
  --port              Server port (default: 11434)
  --model-dir         Override model storage directory
  --log-level         Log verbosity: error|warn|info|debug|trace
  -v, --version       Print version
  -h, --help          Show help

🆚 Comparison

| | LinkLLM | Ollama | LiteLLM |
|---|---|---|---|
| Local GGUF inference | ✅ | ✅ | ❌ |
| API proxy (OpenAI / Gemini / etc.) | ✅ | ❌ | ✅ |
| HuggingFace model pull | ✅ | Partial | ❌ |
| TUI monitoring dashboard | ✅ | ❌ | Web UI only |
| Multi-model routing + fallback | ✅ | ❌ | ✅ |
| Encrypted API key management | ✅ | ❌ | Partial |
| Rust core (memory safe) | ✅ | Go | Python |
| OpenAI-compatible REST API | ✅ | ✅ | ✅ |
| Native Rust SDK | ✅ | ❌ | ❌ |
| Pure-Rust inference (candle) | ✅ | ❌ | ❌ |
| Mobile / Termux | ✅ | Limited | Limited |
| Cost tracking per request | ✅ | ❌ | ✅ |
| Single binary, no Docker | ✅ | ✅ | ❌ |


🏗️ Architecture

┌─────────────────────────────────────────────────┐
│              User Interface Layer                │
│   CLI Chat · TUI Monitor · Model Manager        │
│              (TypeScript + Ink)                  │
└────────────────────┬────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────┐
│              API Gateway (Rust/Axum)             │
│   REST API · Auth · Rate Limiter · TLS          │
└────────────────────┬────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────┐
│            Core Engine (Rust/Tokio)              │
│   Router · Pipeline · Context · Metrics         │
└──────┬─────────────┬───────────────┬────────────┘
       │             │               │
┌──────▼──────┐ ┌────▼─────┐ ┌──────▼──────┐
│  Local GGUF │ │  Python  │ │  API Proxy  │
│  llama.cpp  │ │  Bridge  │ │ OAI/Gemini/ │
│  + candle   │ │    HF    │ │  Anthropic  │
└─────────────┘ └──────────┘ └─────────────┘

See the full Architecture Document for details.


📦 Packages

| Package | Registry | Install |
|---|---|---|
| linkllm (binary) | GitHub Releases | curl -fsSL https://install.linkllm.dev \| sh |
| linkllm (CLI) | npm | npm install -g linkllm |
| linkllm (Python SDK) | PyPI | pip install linkllm |
| linkllm (Rust SDK) | crates.io | cargo add linkllm |


🚀 Roadmap

  • [x] Core Rust engine + Axum server
  • [x] OpenAI-compatible API
  • [x] llama.cpp GGUF inference
  • [x] HuggingFace model pull
  • [x] API proxy (OpenAI, Gemini, Anthropic)
  • [x] TUI monitoring dashboard
  • [x] Encrypted API key management
  • [ ] Multi-model routing (in progress)
  • [ ] candle pure-Rust inference
  • [ ] WebUI dashboard
  • [ ] Model fine-tuning support
  • [ ] Plugin / middleware system
  • [ ] LoRA adapter merge
  • [ ] Distributed inference
  • [ ] LinkLLM Cloud (hosted)

🤝 Contributing

Contributions are welcome! Please read CONTRIBUTING.md before submitting a PR.

git clone https://github.com/linkllm/linkllm
cd linkllm

# Build Rust core
cargo build

# Run tests
cargo test

# Build CLI
cd cli && npm install && npm run build

# Run Python bridge tests
cd python && pip install -e ".[dev]" && pytest

Good first issues are labeled good-first-issue on GitHub.


📄 License

MIT License © 2025 AJ Ashik