@three-ws/ibm-watsonx-mcp
v0.2.0
Published
MCP server for IBM watsonx.ai by three.ws — chat, text generation, embeddings, tokenization, and model discovery with IBM Granite foundation models. Uses your own IBM Cloud credentials.
Downloads
268
Maintainers
Readme
A Model Context Protocol server that exposes IBM watsonx.ai — IBM Granite foundation models and anything else in your account — to MCP clients such as Claude Desktop, Claude Code, and Cursor. It talks directly to the watsonx.ai as-a-Service REST API with your own IBM Cloud credentials: it mints an IAM bearer token from your API key, caches it until just before expiry, and scopes every call to your project. No intermediary backend, no telemetry, no mock data.
Community-built and not affiliated with IBM. Registry name:
io.github.nirholas/ibm-watsonx.
Install
npm install @three-ws/ibm-watsonx-mcpRun it directly with npx (no install needed):
WATSONX_API_KEY=... WATSONX_PROJECT_ID=... npx @three-ws/ibm-watsonx-mcpOr install globally to get the ibm-watsonx-mcp binary on your PATH:
npm install -g @three-ws/ibm-watsonx-mcpQuick start
With Claude Code, one command:
claude mcp add ibm-watsonx -e WATSONX_API_KEY=... -e WATSONX_PROJECT_ID=... -- npx -y @three-ws/ibm-watsonx-mcpOr add the server to your MCP client config (claude_desktop_config.json, Cursor's mcp.json):
{
"mcpServers": {
"ibm-watsonx": {
"command": "npx",
"args": ["-y", "@three-ws/ibm-watsonx-mcp"],
"env": {
"WATSONX_API_KEY": "your-ibm-cloud-api-key",
"WATSONX_PROJECT_ID": "your-watsonx-project-id"
}
}
}
}Restart the client and the six tools below appear. Inspect the surface manually with the MCP Inspector:
npx -y @modelcontextprotocol/inspector npx @three-ws/ibm-watsonx-mcpTools
| Tool | What it does |
| --------------------- | ----------------------------------------------------------------------------------------------- |
| watsonx_chat | Chat completion from a list of role/content messages. Returns the reply plus token usage. |
| watsonx_generate | Raw prompt completion with decoding control (greedy/sample, stop sequences). |
| watsonx_embed | Embedding vectors for one or more texts (up to 1000). One vector per input plus dimensionality. |
| watsonx_tokenize | Token count (and optionally the tokens) for a text against a model tokenizer. |
| watsonx_forecast | Zero-shot time-series forecasting with an IBM Granite TimeSeries (TinyTimeMixer) model. |
| watsonx_list_models | List the foundation models available to your account and region, optionally filtered. |
Every tool is a read-only model-inference call — nothing in your environment or IBM account is modified — and declares MCP tool annotations (readOnlyHint, openWorldHint, idempotentHint) so clients can reason about side effects.
Input parameters
watsonx_chat — messages (required: array of { role, content }, roles system/user/assistant), model, max_tokens (1–4096), temperature (0–2), top_p (0–1).
watsonx_generate — input (required), model, max_new_tokens (1–4096), temperature (0–2), decoding_method (greedy/sample), stop_sequences.
watsonx_embed — inputs (required: 1–1000 texts), model.
watsonx_tokenize — input (required), model, return_tokens (boolean).
watsonx_forecast — timestamps (required: ISO-8601, uniform cadence, oldest first), values (required: numeric, same length), freq (required: pandas cadence, e.g. 1h, 15min, 1D), model, target_column, prediction_length (1–96).
watsonx_list_models — filter (e.g. function_text_generation, function_embedding), limit (1–200, default 100).
Example calls
// watsonx_chat
{ "messages": [{ "role": "user", "content": "Explain MCP in one sentence." }] }
// watsonx_generate
{ "input": "Write a haiku about Kubernetes.", "decoding_method": "sample", "temperature": 0.7, "max_new_tokens": 60 }
// watsonx_embed
{ "inputs": ["lakehouse", "data warehouse"] }
// watsonx_forecast
{ "timestamps": ["2025-01-01T00:00:00Z", "2025-01-01T01:00:00Z"], "values": [1200, 1180], "freq": "1h", "prediction_length": 24 }
// watsonx_list_models
{ "filter": "function_embedding" }How auth works
WATSONX_API_KEY is exchanged for a short-lived IAM bearer token at https://iam.cloud.ibm.com/identity/token (grant_type=urn:ibm:params:oauth:grant-type:apikey). The token is cached in-process and refreshed about five minutes before it expires. Your API key never leaves your machine except in that single IAM exchange.
Requirements
- Node.js >= 20.
- An IBM Cloud account with watsonx.ai provisioned — sign up (free tier available).
- An IBM Cloud API key — create one at https://cloud.ibm.com/iam/apikeys.
- Your watsonx.ai project id — open the project → Manage → General → Project ID (or a deployment space id).
Environment variables
| Variable | Required | Default |
| ------------------------ | ----------------------------------- | ----------------------------------------- |
| WATSONX_API_KEY | yes | — |
| WATSONX_PROJECT_ID | yes (or WATSONX_SPACE_ID) | — |
| WATSONX_SPACE_ID | alternative to WATSONX_PROJECT_ID | — |
| WATSONX_URL | no | https://us-south.ml.cloud.ibm.com |
| WATSONX_MODEL_ID | no | ibm/granite-3-8b-instruct |
| WATSONX_EMBED_MODEL_ID | no | ibm/granite-embedding-278m-multilingual |
| WATSONX_API_VERSION | no | 2024-05-31 |
Regional hosts: us-south, eu-de, eu-gb, jp-tok, au-syd, ca-tor — e.g. https://eu-de.ml.cloud.ibm.com.
Related
@three-ws/ibm-x402-mcp— the same IBM Granite tools as pay-per-use x402 endpoints (USDC on Solana), so callers need no IBM account of their own.
Links
- Homepage: https://three.ws
- Changelog: https://three.ws/changelog
- Issues: https://github.com/nirholas/three.ws/issues
- License: Apache-2.0 — see LICENSE
