
mcp-external-expert

v0.2.1

MCP server that allows primary LLMs to delegate sub-tasks to external expert models via APIs (Ollama, OpenAI-compatible)

Downloads: 34

Readme

MCP External Expert Server

An MCP (Model Context Protocol) server that allows a primary LLM to delegate sub-tasks to external expert models via APIs (Ollama, OpenAI-compatible).

Think of it like "phone a friend" - when the primary model needs help with planning, critique, or reasoning, it can call external expert models for assistance.

Installation

Install from npmjs.com:

npm install -g mcp-external-expert

That's it! The package is ready to use.

Purpose

This server:

  • Lets primary LLMs (e.g., Qwen3 Coder, Claude, GPT-4) delegate planning, critique, testing, and explanation tasks to external expert models
  • Avoids unloading the primary model (and losing its KV cache) on the primary llama-server
  • Routes requests to Ollama or OpenAI-compatible endpoints
  • Is configurable via environment variables
  • Supports both STDIO (for desktop tools) and HTTP (for remote/shared usage)

Configuration

The server is configured via environment variables, which can be supplied either:

  • Directly in your shell or system environment
  • Via a .env file (recommended for local development; loaded automatically)

Using .env File (Recommended)

  1. Copy the example file:

    cp .env.example .env
  2. Edit .env with your settings:

    DELEGATE_PROVIDER=ollama
    DELEGATE_BASE_URL=http://localhost:11434
    DELEGATE_MODEL=qwen2.5:14b-instruct
    DELEGATE_API_KEY=your-api-key-here

The .env file is gitignored and will not be committed to version control.

Environment Variables

Provider Selection

# In .env file or as environment variables:
DELEGATE_PROVIDER=ollama | openai_compat
DELEGATE_BASE_URL=http://host:port
DELEGATE_MODEL=model-name

OpenAI-compatible Only

# In .env file or as environment variables:
DELEGATE_API_KEY=sk-...
DELEGATE_OPENAI_PATH=/v1/chat/completions

Behavior

# Timeout for API calls in milliseconds (default: 60000 = 60 seconds)
# Increase this if your Ollama server is slow (e.g., 300000 for 5 minutes)
DELEGATE_TIMEOUT_MS=60000
DELEGATE_MAX_TOKENS=800
DELEGATE_TEMPERATURE=0.2

Optional Per-Mode System Prompts

DELEGATE_SYSTEM_PLAN="..."
DELEGATE_SYSTEM_CRITIC="..."
DELEGATE_SYSTEM_TESTS="..."
DELEGATE_SYSTEM_EXPLAIN="..."

MCP Transport Toggles

MCP_HTTP=true
MCP_HTTP_PORT=3333
MCP_STDIO=true   # default

Usage

Development

For local development, clone the repository and install dependencies:

git clone <repository-url>
cd mcp-external-expert-server
npm install
npm run dev

Production

npm run build
npm start

Testing

# Run tests
npm test

# Run tests in watch mode
npm run test:watch

# Run tests with coverage
npm run test:coverage

Example Runs

Using .env File (Recommended)

  1. Create .env file with your configuration:

    cp .env.example .env
    # Edit .env with your settings
  2. Run the server:

    npm start

Using Environment Variables

Ollama Helper (Remote Box)

DELEGATE_PROVIDER=ollama \
DELEGATE_BASE_URL=http://ollama-box:11434 \
DELEGATE_MODEL=qwen2.5:14b-instruct \
npm start

llama-server OpenAI API

DELEGATE_PROVIDER=openai_compat \
DELEGATE_BASE_URL=http://localhost:8080 \
DELEGATE_MODEL=qwen2.5:14b-instruct \
DELEGATE_API_KEY="" \
npm start

Enable HTTP MCP

MCP_HTTP=true MCP_HTTP_PORT=3333 npm start

Note: Environment variables set on the command line will override values in .env files.

Exposed MCP Tool

Tool: delegate

Delegates a subtask to an external expert model.

Input Schema:

{
  "mode": "plan | review | challenge | explain | tests",
  "input": "string (required)",
  "context": "string (optional)",
  "maxChars": "number (optional, default 12000)"
}

Modes:

  • plan → step-by-step plan + assumptions + risks
  • review → code review: identify bugs and quality issues, and suggest fixes (code-specific)
  • challenge → devil's advocate: challenge ideas and find flaws in any concept/proposal (general)
  • tests → test checklist + edge cases
  • explain → concise explanation
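As a minimal sketch, a call to the delegate tool over the HTTP transport is an ordinary MCP JSON-RPC `tools/call` request. The envelope below follows the standard MCP shape, and the arguments follow the schema above; the port assumes the server was started with MCP_HTTP=true and the default MCP_HTTP_PORT:

```shell
# Build a tools/call request for the delegate tool (MCP JSON-RPC envelope).
PAYLOAD='{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "delegate",
    "arguments": {
      "mode": "plan",
      "input": "Refactor the session handling into smaller modules"
    }
  }
}'

# With the server running (MCP_HTTP=true MCP_HTTP_PORT=3333 npm start):
# curl -sS -X POST http://localhost:3333/mcp \
#   -H 'Content-Type: application/json' \
#   -d "$PAYLOAD"
echo "$PAYLOAD"
```

The response comes back as a JSON-RPC result whose content is the expert model's plain-text answer.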

Supported Providers

1. Ollama (Recommended)

  • Keeps a helper model warm on a separate machine
  • No auth complexity
  • No impact on primary llama.cpp cache

Uses: POST /api/chat
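For reference, a minimal non-streaming request in Ollama's documented /api/chat shape looks like the following. The model name and prompts are placeholders, and the exact body this server sends upstream is an implementation detail:

```shell
# Minimal Ollama /api/chat request body (non-streaming).
BODY='{
  "model": "qwen2.5:14b-instruct",
  "messages": [
    {"role": "system", "content": "You are a concise planning assistant."},
    {"role": "user", "content": "Plan the steps to add request caching."}
  ],
  "stream": false
}'

# Against a local Ollama instance:
# curl -sS http://localhost:11434/api/chat -d "$BODY"
echo "$BODY"
```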

2. OpenAI-compatible Endpoints

Works with:

  • OpenAI
  • llama-server (--api)
  • LiteLLM
  • vLLM OpenAI shims

Uses: POST /v1/chat/completions

Transport Modes

STDIO (Default)

Used by:

  • Cursor
  • Goose Desktop
  • Claude Desktop
  • Other MCP desktop tools

JSON-RPC over stdin/stdout.
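For example, a desktop client entry might look like the sketch below (shown for a Claude Desktop-style `mcpServers` config; the server name `external-expert` is arbitrary, and the `command` assumes the global install from the Installation section):

```shell
# Hypothetical mcpServers entry for a desktop MCP client config
# (e.g., claude_desktop_config.json). The env block mirrors the
# variables documented under Configuration.
CONFIG=$(cat <<'EOF'
{
  "mcpServers": {
    "external-expert": {
      "command": "mcp-external-expert",
      "env": {
        "DELEGATE_PROVIDER": "ollama",
        "DELEGATE_BASE_URL": "http://localhost:11434",
        "DELEGATE_MODEL": "qwen2.5:14b-instruct"
      }
    }
  }
}
EOF
)
echo "$CONFIG"
```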

HTTP MCP (Optional)

  • Long-running server
  • Shared across machines
  • Keeps helper model hot
  • Supports both regular HTTP POST and SSE (Server-Sent Events) streaming
  • CORS enabled for web-based clients (MCP Inspector, etc.)

Endpoints:

  • POST /mcp - main MCP endpoint (JSON-RPC); GET/POST /mcp also supports SSE streaming
  • GET/POST /sse - SSE streaming endpoint

This is MCP over HTTP using the Streamable HTTP transport specification, which supports:

  • Regular HTTP POST requests (JSON-RPC)
  • SSE (Server-Sent Events) for streaming responses
  • CORS headers for browser-based clients
  • Compatible with MCP Inspector, Goose Desktop, Cursor, and other MCP clients

Security Notes

  • HTTP mode should be LAN-only or behind auth
  • Delegated prompts may contain sensitive code
  • STDIO mode is safest by default
  • Secrets in input are automatically redacted

Design Notes

  • The helper model must not call tools recursively
  • The helper model output is returned as plain text
  • The main model decides when to delegate (like "phoning a friend" when it needs help)
  • Delegation should be used sparingly (planning, critique, validation)
  • This avoids KV cache eviction on the primary inference host
  • The helper model is completely isolated - it only sees what the primary model explicitly passes to it