@three-ws/ibm-watsonx-mcp

v0.2.0

Published

10 hours ago

MCP server for IBM watsonx.ai by three.ws — chat, text generation, embeddings, tokenization, and model discovery with IBM Granite foundation models. Uses your own IBM Cloud credentials.

Downloads

268

0High
0Medium
0Low

three-ws

mcp model-context-protocol ibm watsonx watsonx.ai granite ibm-granite foundation-models embeddings llm ai-agent three.ws

A Model Context Protocol server that exposes IBM watsonx.ai — IBM Granite foundation models and anything else in your account — to MCP clients such as Claude Desktop, Claude Code, and Cursor. It talks directly to the watsonx.ai as-a-Service REST API with your own IBM Cloud credentials: it mints an IAM bearer token from your API key, caches it until just before expiry, and scopes every call to your project. No intermediary backend, no telemetry, no mock data.

Community-built and not affiliated with IBM. Registry name: io.github.nirholas/ibm-watsonx.

Install

npm install @three-ws/ibm-watsonx-mcp

Run it directly with npx (no install needed):

WATSONX_API_KEY=... WATSONX_PROJECT_ID=... npx @three-ws/ibm-watsonx-mcp

Or install globally to get the ibm-watsonx-mcp binary on your PATH:

npm install -g @three-ws/ibm-watsonx-mcp

Quick start

With Claude Code, one command:

claude mcp add ibm-watsonx -e WATSONX_API_KEY=... -e WATSONX_PROJECT_ID=... -- npx -y @three-ws/ibm-watsonx-mcp

Or add the server to your MCP client config (claude_desktop_config.json, Cursor's mcp.json):

{
	"mcpServers": {
		"ibm-watsonx": {
			"command": "npx",
			"args": ["-y", "@three-ws/ibm-watsonx-mcp"],
			"env": {
				"WATSONX_API_KEY": "your-ibm-cloud-api-key",
				"WATSONX_PROJECT_ID": "your-watsonx-project-id"
			}
		}
	}
}

Restart the client and the six tools below appear. Inspect the surface manually with the MCP Inspector:

npx -y @modelcontextprotocol/inspector npx @three-ws/ibm-watsonx-mcp

Tools

| Tool | What it does | | --------------------- | ----------------------------------------------------------------------------------------------- | | watsonx_chat | Chat completion from a list of role/content messages. Returns the reply plus token usage. | | watsonx_generate | Raw prompt completion with decoding control (greedy/sample, stop sequences). | | watsonx_embed | Embedding vectors for one or more texts (up to 1000). One vector per input plus dimensionality. | | watsonx_tokenize | Token count (and optionally the tokens) for a text against a model tokenizer. | | watsonx_forecast | Zero-shot time-series forecasting with an IBM Granite TimeSeries (TinyTimeMixer) model. | | watsonx_list_models | List the foundation models available to your account and region, optionally filtered. |

Every tool is a read-only model-inference call — nothing in your environment or IBM account is modified — and declares MCP tool annotations (readOnlyHint, openWorldHint, idempotentHint) so clients can reason about side effects.

Input parameters

watsonx_chat — messages (required: array of { role, content }, roles system/user/assistant), model, max_tokens (1–4096), temperature (0–2), top_p (0–1).

watsonx_generate — input (required), model, max_new_tokens (1–4096), temperature (0–2), decoding_method (greedy/sample), stop_sequences.

watsonx_embed — inputs (required: 1–1000 texts), model.

watsonx_tokenize — input (required), model, return_tokens (boolean).

watsonx_forecast — timestamps (required: ISO-8601, uniform cadence, oldest first), values (required: numeric, same length), freq (required: pandas cadence, e.g. 1h, 15min, 1D), model, target_column, prediction_length (1–96).

watsonx_list_models — filter (e.g. function_text_generation, function_embedding), limit (1–200, default 100).

Example calls

// watsonx_chat
{ "messages": [{ "role": "user", "content": "Explain MCP in one sentence." }] }

// watsonx_generate
{ "input": "Write a haiku about Kubernetes.", "decoding_method": "sample", "temperature": 0.7, "max_new_tokens": 60 }

// watsonx_embed
{ "inputs": ["lakehouse", "data warehouse"] }

// watsonx_forecast
{ "timestamps": ["2025-01-01T00:00:00Z", "2025-01-01T01:00:00Z"], "values": [1200, 1180], "freq": "1h", "prediction_length": 24 }

// watsonx_list_models
{ "filter": "function_embedding" }

How auth works

WATSONX_API_KEY is exchanged for a short-lived IAM bearer token at https://iam.cloud.ibm.com/identity/token (grant_type=urn:ibm:params:oauth:grant-type:apikey). The token is cached in-process and refreshed about five minutes before it expires. Your API key never leaves your machine except in that single IAM exchange.

Requirements

Node.js >= 20.
An IBM Cloud account with watsonx.ai provisioned — sign up (free tier available).
An IBM Cloud API key — create one at https://cloud.ibm.com/iam/apikeys.
Your watsonx.ai project id — open the project → Manage → General → Project ID (or a deployment space id).

Environment variables

| Variable | Required | Default | | ------------------------ | ----------------------------------- | ----------------------------------------- | | WATSONX_API_KEY | yes | — | | WATSONX_PROJECT_ID | yes (or WATSONX_SPACE_ID) | — | | WATSONX_SPACE_ID | alternative to WATSONX_PROJECT_ID | — | | WATSONX_URL | no | https://us-south.ml.cloud.ibm.com | | WATSONX_MODEL_ID | no | ibm/granite-3-8b-instruct | | WATSONX_EMBED_MODEL_ID | no | ibm/granite-embedding-278m-multilingual | | WATSONX_API_VERSION | no | 2024-05-31 |

Regional hosts: us-south, eu-de, eu-gb, jp-tok, au-syd, ca-tor — e.g. https://eu-de.ml.cloud.ibm.com.

@three-ws/ibm-x402-mcp — the same IBM Granite tools as pay-per-use x402 endpoints (USDC on Solana), so callers need no IBM account of their own.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme