@theyahia/salutespeech-mcp
v1.2.0
MCP server for SaluteSpeech — speech recognition and synthesis (Russia)
MCP server for Sber SaluteSpeech API — speech recognition (STT) and synthesis (TTS). 5 tools.
Part of Russian API MCP (50 servers) by @theYahia.
Install
Claude Desktop
{
  "mcpServers": {
    "salutespeech": {
      "command": "npx",
      "args": ["-y", "@theyahia/salutespeech-mcp"],
      "env": { "SALUTESPEECH_API_KEY": "your-base64-key" }
    }
  }
}
Claude Code
claude mcp add salutespeech -e SALUTESPEECH_API_KEY=your-key -- npx -y @theyahia/salutespeech-mcp
Streamable HTTP mode
SALUTESPEECH_API_KEY=your-key npx @theyahia/salutespeech-mcp --http --port=3000
# POST http://localhost:3000/mcp
# GET http://localhost:3000/health
Auth
Three options (checked in order):
| Env var | Format |
|---------|--------|
| SALUTESPEECH_API_KEY | Base64-encoded client_id:client_secret |
| SALUTE_AUTH_KEY | Same (legacy alias) |
| SALUTE_SPEECH_CLIENT_ID + SALUTE_SPEECH_CLIENT_SECRET | Raw credentials (auto-encoded) |
OAuth tokens are obtained and refreshed automatically. The scope defaults to
SALUTE_SPEECH_PERS (individuals); set SALUTE_SPEECH_SCOPE for corporate accounts
(SALUTE_SPEECH_CORP — postpaid, SALUTE_SPEECH_B2B — prepaid).
Get credentials at developers.sber.ru.
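As a rough sketch of the key format in the table above, the Base64 value can be built from raw credentials in the shell. The function names `make_salute_key` and `resolve_salute_key` are hypothetical helpers for illustration. The second one only mirrors the documented lookup order and is not the server's actual code.

```shell
# Build the key: base64("client_id:client_secret").
# tr strips the line wraps that GNU base64 adds to long output.
make_salute_key() {
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Illustrative mirror of the documented lookup order (assumption:
# an empty variable is treated the same as an unset one).
resolve_salute_key() {
  if [ -n "${SALUTESPEECH_API_KEY:-}" ]; then
    printf '%s' "$SALUTESPEECH_API_KEY"
  elif [ -n "${SALUTE_AUTH_KEY:-}" ]; then
    printf '%s' "$SALUTE_AUTH_KEY"
  elif [ -n "${SALUTE_SPEECH_CLIENT_ID:-}" ] && [ -n "${SALUTE_SPEECH_CLIENT_SECRET:-}" ]; then
    make_salute_key "$SALUTE_SPEECH_CLIENT_ID" "$SALUTE_SPEECH_CLIENT_SECRET"
  else
    echo "no SaluteSpeech credentials found" >&2
    return 1
  fi
}

# Example use (placeholder credentials):
#   export SALUTESPEECH_API_KEY="$(make_salute_key my-client-id my-client-secret)"
```

Setting `SALUTESPEECH_API_KEY` explicitly keeps the configuration to a single variable, which is convenient when the value is pasted into an MCP client's env block.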
Tools (5)
| Tool | Description |
|------|-------------|
| recognize_speech | STT from Base64 audio |
| synthesize_speech | TTS, returns Base64 audio |
| list_models | List recognition models and synthesis voices |
| get_task_status | Check async recognition task status |
| recognize_file | STT from a local file path |
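Since recognize_speech takes Base64 audio and synthesize_speech returns it, a pair of small shell helpers can handle the conversion outside the MCP client. This is a minimal sketch: the function names are hypothetical, and it assumes the tools work with unwrapped (single-line) Base64.

```shell
# Encode an audio file as single-line Base64.
audio_to_base64() {
  base64 < "$1" | tr -d '\n'
}

# Decode a Base64 string into an audio file.
# --decode is accepted by both GNU and macOS base64.
base64_to_audio() {
  printf '%s' "$1" | base64 --decode > "$2"
}

# Example use (paths are placeholders):
#   audio_to_base64 /tmp/meeting.wav > /tmp/meeting.b64
#   base64_to_audio "$(cat /tmp/reply.b64)" /tmp/reply.wav
```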
Skills
skill-transcribe: guided workflow for audio transcription
skill-synthesize: guided workflow for speech synthesis
Examples
Transcribe the audio file /tmp/meeting.wav
Synthesize "Hello world" with voice Bys_24000 in wav16 format
List available voices
Limits
recognize_speech / recognize_file use the synchronous endpoint, capped at
2 MB / 1 minute of audio (larger input returns HTTP 413). For multi-channel audio only
the first channel is recognized. Longer recordings need the asynchronous flow
(data:upload → speech:async_recognize → task:get → data:download) — not yet exposed
as tools; get_task_status covers the polling step.
Synthesis input is capped at 4000 characters (incl. spaces and SSML markup).
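To avoid an HTTP 413 round trip, the file size can be checked before calling recognize_file. The sketch below is a hypothetical helper, not part of the package. It reads "2 MB" as 2 * 1024 * 1024 bytes, which is an assumption, and it does not check the 1 minute duration limit.

```shell
# Assumed byte limit for the synchronous endpoint (2 MB read as 2 MiB).
SYNC_MAX_BYTES=$((2 * 1024 * 1024))

# Return 0 if the file fits the limit, 1 if it is too large,
# 2 if it is not a regular file.
fits_sync_limit() {
  [ -f "$1" ] || return 2
  # Arithmetic expansion strips the padding some wc builds add.
  sync_size=$(( $(wc -c < "$1") ))
  [ "$sync_size" -le "$SYNC_MAX_BYTES" ]
}

# Example use (path is a placeholder):
#   fits_sync_limit /tmp/meeting.wav || echo "too large for the synchronous endpoint"
```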
Troubleshooting
self-signed certificate in certificate chain / UNABLE_TO_VERIFY_LEAF_SIGNATURE
Sber's endpoints use the Russian Trusted Root CA (НУЦ Минцифры), which is not in Node.js's default trust store — so the very first OAuth call fails until you trust it.
Fix: download the root CA (russian_trusted_root_ca_pem.crt) from
gosuslugi.ru/crt and point Node at it:
export NODE_EXTRA_CA_CERTS=/path/to/russian_trusted_root_ca_pem.crt
In an MCP client, add it to the server's env block. Do not set
NODE_TLS_REJECT_UNAUTHORIZED=0 in production, because it disables TLS verification entirely.
Official guide: SaluteSpeech certificates.
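Node.js only emits a warning and carries on when the file named in NODE_EXTRA_CA_CERTS is missing or malformed, so a wrong path brings back the same TLS error. A quick sanity check before exporting the variable can catch that. The helper name `check_ca_file` is hypothetical, and the check only confirms that the file is readable and contains a PEM certificate header, not that it is the right certificate.

```shell
# Return 0 if the file is readable and looks like a PEM certificate.
check_ca_file() {
  [ -r "$1" ] && grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Example use (path is a placeholder):
#   check_ca_file /path/to/russian_trusted_root_ca_pem.crt \
#     && export NODE_EXTRA_CA_CERTS=/path/to/russian_trusted_root_ca_pem.crt
```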
License
MIT
