mcp-nvidia-orchestrator

v1.1.0

Published

a month ago

NVIDIA NIM Expert Orchestrator with MCP support. A multi-model consensus and synthesis engine.

Downloads: 161

Vulnerabilities: 0 High, 0 Medium, 0 Low

Maintainer: elhoptimista

Keywords: mcp, nvidia, nim, ai, orchestrator, llama3, multi-model, consensus

🟢 NVIDIA NIM Expert Orchestrator (MCP)

A Model Context Protocol (MCP) server that orchestrates multiple NVIDIA NIM models. Its consensus engine consults specialized expert models and synthesizes their responses into a single answer.

🚀 Key Features

8 Specialized Expert Groups: Automated routing to models optimized for coding, UI/UX, security, reasoning, and more.
Consensus & Synthesis: Optional multi-model polling to compare expert opinions and generate a unified, high-quality answer.
Library & Server Modes: Use it as a standalone MCP server for Claude/Cursor or as a TypeScript library in your own apps.
Production Ready: Built-in retries, timeouts, and fallback mechanisms, with Llama 3.3 70B used for synthesis.
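
The retry, timeout, and fallback behavior listed above could look roughly like the sketch below. This is not the package's actual implementation: `ModelCall`, `withTimeout`, `withRetry`, and `askWithFallback`, along with the attempt counts and the 1000 ms limit, are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical helpers for illustration; not the package's real internals.
type ModelCall = (prompt: string) => Promise<string>;

// Reject if `task` does not settle within `ms` milliseconds.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    task.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

// Call the model up to `attempts` times, pausing `delayMs` between failures.
async function withRetry(
  call: ModelCall,
  prompt: string,
  attempts: number,
  timeoutMs: number,
  delayMs = 0
): Promise<string> {
  let lastError: unknown = new Error("no attempts made");
  for (let i = 0; i < attempts; i++) {
    try {
      return await withTimeout(call(prompt), timeoutMs);
    } catch (error) {
      lastError = error;
      if (i < attempts - 1 && delayMs > 0) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Prefer the primary model; if every retry fails, try the fallback once.
async function askWithFallback(
  primary: ModelCall,
  fallback: ModelCall,
  prompt: string
): Promise<string> {
  try {
    return await withRetry(primary, prompt, 3, 1000);
  } catch {
    return withRetry(fallback, prompt, 1, 1000);
  }
}
```

Keeping the timeout inside the retry loop means each attempt gets its own time budget, so one slow call cannot consume the whole allowance.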

🛠️ Installation & Setup

1. Get your NVIDIA API Key

Get your free or enterprise API key at build.nvidia.com.

2. Install as a Server

To use this with a client like Claude Desktop, add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "nvidia-orchestrator": {
      "command": "npx",
      "args": ["-y", "mcp-nvidia-orchestrator"],
      "env": {
        "NVIDIA_API_KEY": "your_api_key_here"
      }
    }
  }
}

3. Use as a Library

npm install mcp-nvidia-orchestrator

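import { NvidiaOrchestrator } from 'mcp-nvidia-orchestrator/orchestrator';

// Read the key from the environment rather than hard-coding it.
const orchestrator = new NvidiaOrchestrator(process.env.NVIDIA_API_KEY ?? "your_api_key");

// Top-level await requires an ES module context.
// Arguments: the prompt, the expert group, and whether to enable consensus.
const response = await orchestrator.ask(
  "Create a high-performance shader for a glass effect",
  "coding",
  true // Enable multi-model consensus
);

console.log(response.content);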

🧠 Expert Groups

The orchestrator routes each request to one of eight specialized model groups, covering areas such as coding, UI/UX, security, and reasoning.

🏗️ Architecture

The orchestrator's Consensus Engine works in three stages:

1. Selection: Picks the best expert group for the task.
2. Parallel Execution: Queries 2-3 expert models simultaneously.
3. Synthesis: A "Master Model" (Llama 3.3 70B) analyzes all responses and merges them into a single final answer.
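
The three stages can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's source: the `Expert` and `Synthesizer` types and the `selectExperts`, `queryExperts`, and `consensus` functions are hypothetical, and the model calls are left as injected functions so the sketch runs without network access.

```typescript
// All names here are illustrative assumptions, not the package's real API.
type Expert = { name: string; ask: (prompt: string) => Promise<string> };
type Synthesizer = (prompt: string, answers: string[]) => Promise<string>;

// 1. Selection: look up the experts registered for the requested group.
function selectExperts(groups: Record<string, Expert[]>, group: string): Expert[] {
  const experts = groups[group];
  if (!experts || experts.length === 0) {
    throw new Error(`unknown expert group: ${group}`);
  }
  return experts;
}

// 2. Parallel execution: query every expert at once, dropping failed calls.
async function queryExperts(experts: Expert[], prompt: string): Promise<string[]> {
  const results = await Promise.all(
    experts.map((expert) =>
      expert
        .ask(prompt)
        .then((text) => `${expert.name}: ${text}`)
        .catch(() => null)
    )
  );
  return results.filter((r): r is string => r !== null);
}

// 3. Synthesis: pass all surviving answers to a single "master" model.
async function consensus(
  groups: Record<string, Expert[]>,
  group: string,
  prompt: string,
  synthesize: Synthesizer
): Promise<string> {
  const answers = await queryExperts(selectExperts(groups, group), prompt);
  if (answers.length === 0) throw new Error("all experts failed");
  return synthesize(prompt, answers);
}
```

In this sketch a failing expert is dropped rather than aborting the whole request, so synthesis can still proceed with the remaining answers. Whether the real engine behaves this way is not stated in this README.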

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Developed with ❤️ for the AI Community.
