modelmix
v4.4.30
🧬 ModelMix: Reliable interface with automatic fallback for AI LLMs
ModelMix is a versatile module that enables seamless integration of various language models from different providers through a unified interface. With ModelMix, you can effortlessly manage and utilize multiple AI models while controlling request rates to avoid provider restrictions. The module also supports the Model Context Protocol (MCP), allowing you to enhance your models with powerful capabilities like web search, code execution, and custom functions.
Ever found yourself wanting to integrate AI models into your projects but worried about reliability? ModelMix helps you build resilient AI applications by chaining multiple models together. If one model fails, it automatically switches to the next one, ensuring your application keeps running smoothly.
✨ Features
- Unified Interface: Interact with multiple AI models through a single, coherent API.
- Request Rate Control: Manage the rate of requests to adhere to provider limitations using Bottleneck.
- Flexible Integration: Easily integrate popular providers like OpenAI, Anthropic, Gemini, Perplexity, Groq, Together AI, Lambda, OpenRouter, Ollama, and LM Studio, or plug in custom models.
- History Tracking: Automatically logs the conversation history with model responses; limit the number of retained messages with `max_history`.
- Model Fallbacks: Automatically try different models if one fails or is unavailable.
- Round Robin Load Balancing: Rotate through multiple models on each request to distribute load and maximize free tier quotas.
- Chain Multiple Models: Create powerful chains of models that work together, with automatic fallback if one fails.
- Model Context Protocol (MCP) Support: Seamlessly integrate external tools and capabilities like web search, code execution, or custom functions through the Model Context Protocol standard.
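For intuition, the fallback feature boils down to trying each attached model in order and returning the first successful answer. Here is a minimal standalone sketch of that pattern (an illustration only, not ModelMix's actual implementation; `firstSuccessful`, `flaky`, and `backup` are hypothetical names):

```javascript
// Minimal sketch of sequential fallback: try each provider in order,
// return the first successful result, and only throw if all of them fail.
async function firstSuccessful(providers) {
    let lastError;
    for (const provider of providers) {
        try {
            return await provider();
        } catch (err) {
            lastError = err; // remember the failure and move on to the next model
        }
    }
    throw lastError ?? new Error('no providers attached');
}

// Hypothetical providers: the first one is "down", the second answers.
const flaky = async () => { throw new Error('rate limited'); };
const backup = async () => 'Paris';
```

ModelMix applies this idea across real provider clients, so a quota error or outage on one model transparently hands the request to the next one in the chain.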
🛠️ Usage
- Install the ModelMix package:
```bash
npm install modelmix
```

AI Skill: You can also add ModelMix as a skill for AI agentic development:

```bash
npx skills add https://github.com/clasen/ModelMix --skill modelmix
```
- Setup your environment variables (.env file): Only the API keys you plan to use are required.
```bash
ANTHROPIC_API_KEY="sk-ant-..."
OPENAI_API_KEY="sk-proj-..."
OPENROUTER_API_KEY="sk-or-..."
MINIMAX_API_KEY="your-minimax-key..."
...
GEMINI_API_KEY="AIza..."
```

For environment variables, use dotenv or Node's built-in process.loadEnvFile().
- Create and configure your models:
```js
import { ModelMix } from 'modelmix';
try { process.loadEnvFile(); } catch {}

// Get structured JSON responses
const model = ModelMix.new()
    .sonnet46() // Anthropic claude-sonnet-4-6
    .addText("Name and capital of 3 South American countries.");

const outputExample = { countries: [{ name: "", capital: "" }] };
console.log(await model.json(outputExample));
```

Chain multiple models with automatic fallback:
```js
const setup = {
    config: {
        system: "You are ALF, if they ask your name, respond with 'ALF'.",
        debug: 2
    }
};

const model = await ModelMix.new(setup)
    .sonnet46() // (main model) Anthropic claude-sonnet-4-6
    .gpt5mini() // (fallback 1) OpenAI gpt-5-mini
    .gemini3flash({ config: { temperature: 0 } }) // (fallback 2) Google gemini-3-flash
    .grok3mini() // (fallback 3) Grok grok-3-mini
    .addText("What's your name?");

console.log(await model.message());
```

Use Perplexity to get the price of ETH:
```js
const ETH = await ModelMix.new()
    .sonar() // Perplexity sonar
    .addText('How much is ETH trading in USD?')
    .json({ price: 1000.1 });

console.log(ETH.price);
```

This example uses providers with free quotas (OpenRouter, Groq, Cerebras): just get the API key and you're ready to go. If one model runs out of quota, ModelMix automatically falls back to the next model in the chain.
```js
ModelMix.new()
    .gptOss()
    .kimiK2()
    .deepseekR1()
    .hermes3()
    .addText('What is the capital of France?');
```

This pattern allows you to:
- Chain multiple models together
- Automatically fall back to the next model if one fails
- Get structured JSON responses when needed
- Track token usage across all providers
- Keep your code clean and maintainable
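Besides sequential fallback, ModelMix can rotate through attached models on every request via the `roundRobin` config flag, which is handy for spreading load across free-tier quotas. The rotation itself can be pictured as a simple wrapping index (a conceptual sketch, not the library's code; `makeRoundRobin` is a hypothetical name):

```javascript
// Sketch of round-robin selection over attached models: each call returns
// the next model in the list, wrapping back to the start at the end.
function makeRoundRobin(models) {
    let index = 0;
    return function next() {
        const model = models[index];
        index = (index + 1) % models.length; // advance and wrap around
        return model;
    };
}
```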
🔧 Model Context Protocol (MCP) Integration
ModelMix makes it incredibly easy to enhance your AI models with powerful capabilities through the Model Context Protocol. With just a few lines of code, you can add features like web search, code execution, or any custom functionality to your models.
Example: Adding Web Search Capability
Include the API key for Brave Search in your .env file.
```bash
BRAVE_API_KEY="BSA0..._fm"
```

```js
const mmix = ModelMix.new({ config: { max_history: 10 } }).gpt5nano();
mmix.setSystem('You are an assistant and today is ' + new Date().toISOString());

// Add web search capability through MCP
await mmix.addMCP('@modelcontextprotocol/server-brave-search');

mmix.addText('Use Internet: When did the last Christian pope die?');
console.log(await mmix.message());
```

This simple integration allows your model to:
- Search the web in real-time
- Access up-to-date information
- Combine AI reasoning with external data
The Model Context Protocol makes it easy to add any capability to your models, from web search to code execution, database queries, or custom functions. All with just a few lines of code!
⚡️ Shorthand Methods
ModelMix provides convenient shorthand methods for quickly accessing different AI models. Here's a comprehensive list of available methods:
| Method | Provider | Model | Price (I/O) per 1 M tokens |
| ------------------ | ---------- | ------------------------------ | -------------------------- |
| gpt54() | OpenAI | gpt-5.4 | $2.50 / $15.00 |
| gpt54mini() | OpenAI | gpt-5.4-mini | $0.75 / $4.50 |
| gpt54nano() | OpenAI | gpt-5.4-nano | $0.20 / $1.25 |
| gpt53codex() | OpenAI | gpt-5.3-codex | $1.25 / $14.00 |
| gpt52() | OpenAI | gpt-5.2 | $1.75 / $14.00 |
| gpt51() | OpenAI | gpt-5.1 | $1.25 / $10.00 |
| gpt5mini() | OpenAI | gpt-5-mini | $0.25 / $2.00 |
| gpt5nano() | OpenAI | gpt-5-nano | $0.05 / $0.40 |
| gpt41() | OpenAI | gpt-4.1 | $2.00 / $8.00 |
| gpt41mini() | OpenAI | gpt-4.1-mini | $0.40 / $1.60 |
| gpt41nano() | OpenAI | gpt-4.1-nano | $0.10 / $0.40 |
| gptOss() | Together | gpt-oss-120B | $0.15 / $0.60 |
| opus46[think]() | Anthropic | claude-opus-4-6 | $5.00 / $25.00 |
| opus45[think]() | Anthropic | claude-opus-4-5-20251101 | $5.00 / $25.00 |
| sonnet46[think]() | Anthropic | claude-sonnet-4-6 | $3.00 / $15.00 |
| sonnet45[think]() | Anthropic | claude-sonnet-4-5-20250929 | $3.00 / $15.00 |
| haiku45[think]() | Anthropic | claude-haiku-4-5-20251001 | $1.00 / $5.00 |
| gemini31pro() | Google | gemini-3.1-pro-preview | $2.00 / $12.00 |
| gemini3pro() | Google | gemini-3-pro-preview | $2.00 / $12.00 |
| gemini3flash() | Google | gemini-3-flash-preview | $0.50 / $3.00 |
| grok4() | Grok | grok-4-0709 | $3.00 / $15.00 |
| grok41[think]() | Grok | grok-4-1-fast | $0.20 / $0.50 |
| deepseekV32() | Fireworks | fireworks/models/deepseek-v3p2 | $0.56 / $1.68 |
| GLM47() | Fireworks | fireworks/models/glm-4p7 | $0.55 / $2.19 |
| minimaxM27() | MiniMax | MiniMax-M2.7 | $0.30 / $1.20 |
| sonar() | Perplexity | sonar | $1.00 / $1.00 |
| sonarPro() | Perplexity | sonar-pro | $3.00 / $15.00 |
| hermes3() | Lambda | Hermes-3-Llama-3.1-405B-FP8 | $0.80 / $0.80 |
| qwen3() | Together | Qwen3-235B-A22B-fp8-tput | $0.20 / $0.60 |
| kimiK2() | Together | Kimi-K2-Instruct | $1.00 / $3.00 |
| kimiK25think() | Together | Kimi-K2.5 | $0.50 / $2.80 |
Each method accepts optional options and config parameters to customize the model's behavior. For example:
```js
const result = await ModelMix.new({
    options: { temperature: 0.7 },
    config: { system: "You are a helpful assistant" }
})
    .sonnet46()
    .addText("Tell me a story about a cat")
    .message();
```

🔄 Templates
ModelMix includes a simple but powerful templating system. You can write your system prompts and user messages in external .md files with placeholders, then use replace to fill them in at runtime.
Core methods
| Method | Description |
| --- | --- |
| setSystemFromFile(path) | Load the system prompt from a file |
| addTextFromFile(path) | Load a user message from a file |
| replace({ key: value }) | Replace placeholders in all messages and the system prompt |
| replaceKeyFromFile(key, path) | Replace a placeholder with the contents of a file |
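Conceptually, `replace` is plain global string substitution: every occurrence of each placeholder key is swapped out in the system prompt and all queued messages. A minimal sketch of that behavior (illustrative only; `renderTemplate` is a hypothetical helper, not part of the ModelMix API):

```javascript
// Conceptual sketch of placeholder replacement: apply every key/value pair
// to a piece of text, replacing ALL occurrences of each key.
function renderTemplate(text, replacements) {
    let out = text;
    for (const [key, value] of Object.entries(replacements)) {
        out = out.split(key).join(value); // global substitution without regex escaping
    }
    return out;
}
```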
Basic example with replace
```js
const gpt = ModelMix.new().gpt52();
gpt.addText('Write a short story about a {animal} that lives in {place}.');
gpt.replace({ '{animal}': 'cat', '{place}': 'a haunted castle' });
console.log(await gpt.message());
```

Loading prompts from .md files
Instead of writing long prompts inline, keep them in separate Markdown files. This makes them easier to read, edit, and version control.
prompts/system.md

```md
You are {role}, an expert in {topic}.
Always respond in {language}.
```

prompts/task.md

```md
Analyze the following and provide 3 key insights:
{content}
```

app.js
```js
const gpt = ModelMix.new().gpt5mini();
gpt.setSystemFromFile('./prompts/system.md');
gpt.addTextFromFile('./prompts/task.md');
gpt.replace({
    '{role}': 'a senior analyst',
    '{topic}': 'market trends',
    '{language}': 'Spanish',
    '{content}': 'Bitcoin surpassed $100,000 in December 2024...'
});
console.log(await gpt.message());
```

Injecting file contents into a placeholder
Use replaceKeyFromFile when the replacement value itself is a large text stored in a file.
prompts/summarize.md

```md
Summarize the following article in 3 bullet points:
{article}
```

app.js
```js
const gpt = ModelMix.new().gpt5mini();
gpt.addTextFromFile('./prompts/summarize.md');
gpt.replaceKeyFromFile('{article}', './data/article.md');
console.log(await gpt.message());
```

Full template workflow
Combine all methods to build reusable, file-based prompt pipelines:
prompts/system.md

```md
You are {role}. Follow these rules:
- Be concise
- Use examples when possible
- Respond in {language}
```

prompts/review.md

```md
Review the following code and suggest improvements:
{code}
```

app.js
```js
const gpt = ModelMix.new().gpt5mini();
gpt.setSystemFromFile('./prompts/system.md');
gpt.addTextFromFile('./prompts/review.md');
gpt.replace({ '{role}': 'a senior code reviewer', '{language}': 'English' });
gpt.replaceKeyFromFile('{code}', './src/utils.js');
console.log(await gpt.message());
```

🧩 JSON Structured Output
The json method forces the model to return a structured JSON response. You define the shape with an example object and optionally describe each field.
```js
await model.json(schemaExample, schemaDescription, options)
```

Basic usage
```js
const model = ModelMix.new()
    .gpt5mini()
    .addText('Name and capital of 3 South American countries.');

const result = await model.json({ countries: [{ name: "", capital: "" }] });
console.log(result);
// { countries: [{ name: "Argentina", capital: "Buenos Aires" }, ...] }
```

Adding field descriptions
The second argument lets you describe each field so the model understands exactly what you expect. Descriptions can be strings (simple) or descriptor objects (with metadata):
```js
const result = await model.json(
    { countries: [{ name: "Argentina", capital: "BUENOS AIRES" }] },
    { countries: [{ name: "name of the country", capital: "capital of the country in uppercase" }] },
    { addNote: true }
);
// { countries: [
//     { name: "Brazil", capital: "BRASILIA" },
//     { name: "Colombia", capital: "BOGOTA" },
//     { name: "Chile", capital: "SANTIAGO" }
// ]}
```

Enhanced descriptors
Descriptions support descriptor objects with description, required, enum, and default:
```js
const result = await model.json(
    { name: 'Martin', age: 22, sex: 'male' },
    {
        name: { description: 'Name of the actor', required: false },
        age: 'Age of the actor', // string still works
        sex: { description: 'Gender', enum: ['male', 'female', null], default: null }
    }
);
```

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| description | string | — | Field description for the model |
| required | boolean | true | If false, field is removed from required and type becomes nullable |
| enum | array | — | Allowed values. If includes null, type auto-becomes nullable |
| default | any | — | Default value for the field |
You can mix strings and descriptor objects freely in the same descriptions parameter.
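To see how the descriptor properties interact, here is a simplified sketch of how a field's example value and descriptor could be folded into a JSON-schema-style property, following the table above (this is an illustration of the documented rules, not ModelMix's actual schema generator; `describeField` is a hypothetical name):

```javascript
// Simplified sketch: turn an example value plus an optional descriptor into
// a JSON-schema-like property entry. Mirrors the rules above: `required`
// defaults to true; `required: false` or a null in `enum` makes the type nullable.
function describeField(exampleValue, descriptor) {
    const desc = typeof descriptor === 'string' ? { description: descriptor } : (descriptor ?? {});
    const baseType = typeof exampleValue === 'number' ? 'number' : 'string';
    const required = desc.required !== false;
    const nullable = !required || (Array.isArray(desc.enum) && desc.enum.includes(null));
    const prop = { type: nullable ? [baseType, 'null'] : baseType };
    if (desc.description) prop.description = desc.description;
    if (desc.enum) prop.enum = desc.enum;
    if ('default' in desc) prop.default = desc.default;
    return { prop, required };
}
```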
Array auto-wrap
When you pass a top-level array as the example, ModelMix automatically wraps it for better LLM compatibility and unwraps the result transparently:
```js
const result = await model.json([{ name: 'martin' }]);
// result is an array: [{ name: "Martin" }, { name: "Carlos" }, ...]
```

Internally, the array is wrapped as { out: [...] } so the model receives a proper object schema, then result.out is returned automatically.
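The wrap/unwrap round trip described above can be sketched in two small helpers (hypothetical names, illustrating the documented behavior rather than the library's internals):

```javascript
// Sketch of the top-level array auto-wrap: arrays are nested under an `out`
// key before schema generation, and unwrapped again after the reply is parsed.
function wrapExample(example) {
    return Array.isArray(example)
        ? { wrapped: true, example: { out: example } } // model sees an object schema
        : { wrapped: false, example };
}

function unwrapResult(result, wrapped) {
    return wrapped ? result.out : result; // caller gets the bare array back
}
```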
Options
| Option | Default | Description |
| --- | --- | --- |
| addSchema | true | Include the generated JSON schema in the system prompt |
| addExample | false | Include the example object in the system prompt |
| addNote | false | Add a note about JSON escaping to prevent parsing errors |
```js
// Include the example and the escaping note
const result = await model.json(
    { name: "John", age: 30, skills: ["JavaScript"] },
    { name: "Full name", age: "Age in years", skills: "List of programming languages" },
    { addExample: true, addNote: true }
);
```

These options give you fine-grained control over how much guidance you provide to the model for generating properly formatted JSON responses.
📊 Token Usage Tracking
ModelMix automatically tracks token usage for all requests across different providers, providing a unified format regardless of the underlying API.
How it works
Every response from raw() includes a tokens object with the following structure:

```js
{
    tokens: {
        input: 150,   // Number of tokens in the prompt/input
        output: 75,   // Number of tokens in the completion/output
        total: 225,   // Total tokens used (input + output)
        cached: 100,  // Cached input tokens reported by the provider (0 when absent)
        cost: 0.0012, // Estimated cost in USD (null if model not in pricing table)
        speed: 42     // Output tokens per second (int)
    }
}
```

lastRaw: Access the full response after message() or json()
After calling message() or json(), use lastRaw to access the complete response (tokens, thinking, tool calls, etc.). It has the same structure as raw().
```js
const text = await model.message();
console.log(model.lastRaw.tokens);
// { input: 122, output: 86, total: 208, cached: 41, cost: 0.000319, speed: 38 }
```

The cached field is a single aggregated count of cached input tokens reported by the provider. The cost field is the estimated cost in USD based on the model's pricing per 1M tokens (input/output). If the model is not found in the pricing table, cost will be null. The speed field is the generation speed measured in output tokens per second (integer).
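The derived fields follow directly from the raw counts and the per-million pricing. A sketch of the arithmetic, using hypothetical pricing values rather than ModelMix's actual pricing table (`summarizeUsage` is an illustrative helper, not part of the API):

```javascript
// Sketch of how the derived token fields could be computed from raw counts.
// `pricing` holds USD cost per 1M input/output tokens; null means "unknown model".
function summarizeUsage({ input, output, elapsedSeconds }, pricing) {
    const total = input + output;
    const cost = pricing
        ? (input * pricing.inputPerM + output * pricing.outputPerM) / 1_000_000
        : null; // null when the model is missing from the pricing table
    const speed = Math.round(output / elapsedSeconds); // output tokens per second
    return { input, output, total, cost, speed };
}
```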
🐛 Enabling Debug Mode
To activate debug mode in ModelMix and view detailed request information, follow these two steps:
1. In the ModelMix constructor, include a debug level in the configuration:

```js
const mix = ModelMix.new({
    config: {
        debug: 4 // 0=silent, 1=minimal, 2=summary, 3=full (no truncate), 4=verbose (raw details)
        // ... other configuration options ...
    }
});
```

2. When running your script from the command line, use the DEBUG=ModelMix* prefix:

```bash
DEBUG=ModelMix* node your_script.js
```
When you run your script this way, you'll see detailed information about the requests in the console, including the configuration and options used for each AI model request.
This information is valuable for debugging and understanding how ModelMix is processing your requests.
🚦 Bottleneck Integration
ModelMix uses Bottleneck for efficient rate limiting of API requests. This integration helps prevent exceeding API rate limits and ensures smooth operation when working with multiple models or high request volumes.
How it works:
- Configuration: Bottleneck is configured in the ModelMix constructor. You can customize the settings or use the default configuration:
```js
const setup = {
    config: {
        bottleneck: {
            maxConcurrent: 8, // Maximum number of concurrent requests
            minTime: 500      // Minimum time between requests (in ms)
        }
    }
};
```

- Rate Limiting: When you make a request using any of the attached models, Bottleneck automatically manages the request flow based on the configured settings.
- Automatic Queueing: If the rate limit is reached, Bottleneck will automatically queue subsequent requests and process them as capacity becomes available.
This integration ensures that your application respects API rate limits while maximizing throughput, providing a robust solution for managing multiple AI model interactions.
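For intuition, a minTime-style limiter can be sketched in a few lines. This is a toy illustration of the concept only; Bottleneck itself does far more (reservoirs, priorities, clustering), and `makeLimiter` is a hypothetical name:

```javascript
// Toy minTime limiter: serializes scheduled tasks so that consecutive task
// starts are at least `minTime` ms apart. Concept sketch, not Bottleneck's code.
function makeLimiter(minTime) {
    let lastStart = 0;
    let chain = Promise.resolve(); // tasks run one after another on this chain
    return function schedule(task) {
        chain = chain.then(async () => {
            const wait = Math.max(0, lastStart + minTime - Date.now());
            if (wait > 0) await new Promise(resolve => setTimeout(resolve, wait));
            lastStart = Date.now();
            return task();
        });
        return chain; // resolves with the task's own result
    };
}
```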
📚 ModelMix Class Overview
```js
new ModelMix(args = { options: {}, config: {} })
```

- args: Configuration object with `options` and `config` properties.
  - options: Default options applied to all models; these can be overridden when creating a specific model instance. Examples:
    - `max_tokens`: Maximum number of tokens to generate, e.g., 2000.
    - `temperature`: Controls the randomness of the model's output, e.g., 1.
    - ... (additional default options can be added as needed)
  - config: Settings that control the behavior of the `ModelMix` instance; these can also be overridden for specific model instances. Examples:
    - `system`: Default system message for the model, e.g., "You are an assistant."
    - `max_history`: Limits the number of historical messages to retain, e.g., 1.
    - `roundRobin`: When `true`, rotates through attached models on each request for load balancing. When `false` (default), uses fallback mode where models are tried sequentially only if previous ones fail.
    - `bottleneck`: Configures the rate-limiting behavior using Bottleneck:
      - `maxConcurrent`: Maximum number of concurrent requests
      - `minTime`: Minimum time between requests (in ms)
      - `reservoir`: Number of requests allowed in the reservoir period
      - `reservoirRefreshAmount`: How many requests are added when the reservoir refreshes
      - `reservoirRefreshInterval`: Reservoir refresh interval
    - ... (additional configuration parameters can be added as needed)
Methods
- `ModelMix.new(args)`: (static) Creates a new `ModelMix` instance using the given setup.
- `attach(modelKey, modelInstance)`: Attaches a model instance to the `ModelMix`.
- `setSystem(text)`: Sets the system prompt.
- `setSystemFromFile(filePath)`: Sets the system prompt from a file.
- `addText(text, config = { role: "user" })`: Adds a text message.
- `addTextFromFile(filePath, config = { role: "user" })`: Adds a text message from a file.
- `addImage(filePath, config = { role: "user" })`: Adds an image message from a file path.
- `addImageFromUrl(url, config = { role: "user" })`: Adds an image message from a URL.
- `replace(keyValues)`: Defines placeholder replacements for messages and the system prompt.
- `replaceKeyFromFile(key, filePath)`: Defines a placeholder replacement with file contents as the value.
- `message()`: Sends the message and returns the response.
- `raw()`: Sends the message and returns the complete response data, including:
  - `message`: The text response from the model
  - `think`: Reasoning/thinking content (if available)
  - `toolCalls`: Array of tool calls made by the model (if any)
  - `tokens`: Object with `input`, `output`, `total`, and `cached` token counts, plus `cost` (USD) and `speed` (output tokens/sec)
  - `response`: The raw API response
- `stream(callback)`: Sends the message and streams the response, invoking the callback with each streamed part.
- `json(schemaExample, descriptions = {}, options = {})`: Forces the model to return a response in a specific JSON format.
  - `schemaExample`: Example of the JSON structure to be returned. Top-level arrays are auto-wrapped for better LLM compatibility.
  - `descriptions`: Descriptions for each field; can be strings or descriptor objects with `{ description, required, enum, default }`.
  - `options`: `{ addSchema: true, addExample: false, addNote: false }`
  - Returns a Promise that resolves to the structured JSON response.
  - Example:

    ```js
    const response = await handler.json(
        { time: '24:00:00', message: 'Hello' },
        { time: 'Time in format HH:MM:SS', message: { description: 'Greeting', required: false } }
    );
    ```

- `block({ addText = true })`: Forces the model to return a response in a specific block format.
MixCustom Class Overview
```js
new MixCustom(args = { config: {}, options: {}, headers: {} })
```

- args: Configuration object with `config`, `options`, and `headers` properties.
  - config:
    - `url`: The endpoint URL to which the model sends requests.
    - `prefix`: An array of strings used as a prefix for requests.
    - ... (additional configuration parameters can be added as needed)
  - options: Default options applied to all models; these can be overridden when creating a specific model instance. Examples:
    - `max_tokens`: Maximum number of tokens to generate, e.g., 2000.
    - `temperature`: Controls the randomness of the model's output, e.g., 1.
    - `top_p`: Controls the diversity of the output, e.g., 1.
    - ... (additional default options can be added as needed)
  - headers:
    - `authorization`: The authorization header, typically including a Bearer token for API access.
    - `x-api-key`: A custom header for the API key if needed.
    - ... (additional headers can be added as needed)
MixOpenAI Class Overview
```js
new MixOpenAI(args = { config: {}, options: {} })
```

- args: Configuration object with `config` and `options` properties.
  - config: Specific configuration settings for OpenAI, including the `apiKey`.
  - options: Default options for OpenAI model instances.
MixOpenRouter Class Overview
```js
new MixOpenRouter(args = { config: {}, options: {} })
```

- args: Configuration object with `config` and `options` properties.
  - config: Specific configuration settings for OpenRouter, including the `apiKey`.
  - options: Default options for OpenRouter model instances.
MixAnthropic Class Overview
```js
new MixAnthropic(args = { config: {}, options: {} })
```

- args: Configuration object with `config` and `options` properties.
  - config: Specific configuration settings for Anthropic, including the `apiKey`.
  - options: Default options for Anthropic model instances.
MixPerplexity Class Overview
```js
new MixPerplexity(args = { config: {}, options: {} })
```

- args: Configuration object with `config` and `options` properties.
  - config: Specific configuration settings for Perplexity, including the `apiKey`.
  - options: Default options for Perplexity model instances.
MixGroq Class Overview

```js
new MixGroq(args = { config: {}, options: {} })
```

- args: Configuration object with `config` and `options` properties.
  - config: Specific configuration settings for Groq, including the `apiKey`.
  - options: Default options for Groq model instances.
MixOllama Class Overview
```js
new MixOllama(args = { config: {}, options: {} })
```

- args: Configuration object with `config` and `options` properties.
  - config: Specific configuration settings for Ollama.
    - `url`: The endpoint URL to which the model sends requests.
  - options: Default options for Ollama model instances.
MixLMStudio Class Overview
```js
new MixLMStudio(args = { config: {}, options: {} })
```

- args: Configuration object with `config` and `options` properties.
  - config: Specific configuration settings for LM Studio.
    - `url`: The endpoint URL to which the model sends requests.
  - options: Default options for LM Studio model instances.
MixTogether Class Overview
```js
new MixTogether(args = { config: {}, options: {} })
```

- args: Configuration object with `config` and `options` properties.
  - config: Specific configuration settings for Together AI, including the `apiKey`.
  - options: Default options for Together AI model instances.
MixGoogle Class Overview
```js
new MixGoogle(args = { config: {}, options: {} })
```

- args: Configuration object with `config` and `options` properties.
  - config: Specific configuration settings for Google Gemini, including the `apiKey`.
  - options: Default options for Google Gemini model instances.
🤝 Contributing
Contributions are welcome! If you find any issues or have suggestions for improvement, please open an issue or submit a pull request on the GitHub repository.
📄 License
The MIT License (MIT)
Copyright (c) Martin Clasen
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
