@runhuman/mcp

v1.0.0

Published

11 days ago

MCP server for AI-orchestrated human QA testing. Let AI agents request real human testing of web applications.

Vulnerabilities: 0 High, 0 Medium, 0 Low

Maintainers: busanix, aaronvolter

Keywords: mcp model-context-protocol qa testing human-in-the-loop ai-agent claude anthropic qa-testing manual-testing

@runhuman/mcp

A Model Context Protocol (MCP) server that enables AI agents to orchestrate on-demand human QA testing through the Runhuman service.

Note: This is the MCP server package. For the main CLI tool, see runhuman (coming soon).

What is this?

This MCP server allows AI agents (like Claude, GPT, or any MCP-compatible AI) to request human testing of web applications. The AI defines what to test and what data format it needs back. Real humans perform the testing, and structured results are extracted automatically from their feedback.

Perfect for:

AI coding agents that need human verification of their work
Automated workflows requiring human-in-the-loop testing
CI/CD pipelines with human QA gates
Visual/UX testing that requires real human judgment

Quick Start

Claude Desktop

Get your API key at runhuman.com/dashboard
Add to your Claude Desktop config:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "runhuman": {
      "command": "npx",
      "args": ["-y", "@runhuman/mcp", "--api-key=qa_live_xxxxxxxxxxxxx"]
    }
  }
}

Restart Claude Desktop

That's it! Claude can now request human testing:

"Can someone test my checkout page at https://myapp.com/checkout and verify the payment flow works?"

Other MCP Clients

For other MCP clients (VS Code extensions, custom agents, etc.):

# Install globally
npm install -g @runhuman/mcp

# Run with your API key
runhuman-mcp --api-key=qa_live_xxxxxxxxxxxxx

Or use npx without installation:

npx @runhuman/mcp --api-key=qa_live_xxxxxxxxxxxxx

Features

Available Tools

`create_job`

Create a new human QA testing job.

Parameters:

url (string, required): The URL to test
description (string, required): Natural language instructions for the human tester describing what to test
schema (object, optional): JSON schema defining the structured output format you want back
targetDurationMinutes (number, optional): Time limit for tester (default: 5, range: 1-60)

Example:

{
  url: "https://myapp.com/checkout",
  description: "Test the checkout flow: add item to cart, proceed to checkout, and verify payment options are displayed correctly.",
  schema: {
    type: "object",
    properties: {
      checkout_loaded: { type: "boolean" },
      payment_options_visible: { type: "boolean" },
      any_errors: { type: "string" }
    }
  },
  targetDurationMinutes: 5
}
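
The parameter rules above can be checked on the client side before a job is submitted. The following is a minimal TypeScript sketch, not the package's own validation: `CreateJobArgs`, `NormalizedCreateJobArgs`, and `normalizeCreateJobArgs` are hypothetical names, and the server may apply its own checks differently.

```typescript
// Hypothetical types mirroring the documented create_job parameters.
interface CreateJobArgs {
  url: string;
  description: string;
  schema?: Record<string, unknown>;
  targetDurationMinutes?: number;
}

interface NormalizedCreateJobArgs extends CreateJobArgs {
  targetDurationMinutes: number;
}

// Applies the documented default (5) and range (1-60) for targetDurationMinutes.
function normalizeCreateJobArgs(args: CreateJobArgs): NormalizedCreateJobArgs {
  if (args.url.trim() === "") throw new Error("url is required");
  if (args.description.trim() === "") throw new Error("description is required");
  const minutes = args.targetDurationMinutes ?? 5;
  if (!Number.isFinite(minutes) || minutes < 1 || minutes > 60) {
    throw new RangeError("targetDurationMinutes must be between 1 and 60");
  }
  return { ...args, targetDurationMinutes: minutes };
}
```

Calling it with only `url` and `description` fills in the 5-minute default; a value outside 1-60 throws before any request is made.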

Returns:

jobId: Unique identifier for tracking the job
status: Current job status (pending, claimed, in_progress, completed, failed, timeout)
estimatedCompletionTime: When the test is expected to complete
url: The URL being tested
description: Test instructions
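
A caller usually needs to know whether a status means the job is finished. The sketch below is one way to express that in TypeScript. The status names come from the list above, but treating `completed`, `failed`, and `timeout` as final is an assumption, and `JobStatus` and `isTerminal` are hypothetical names.

```typescript
// Status values as listed in the create_job return description.
type JobStatus = "pending" | "claimed" | "in_progress" | "completed" | "failed" | "timeout";

// Assumption: these three statuses are final; the others mean work is still underway.
const TERMINAL_STATUSES: ReadonlySet<JobStatus> = new Set<JobStatus>([
  "completed",
  "failed",
  "timeout",
]);

function isTerminal(status: JobStatus): boolean {
  return TERMINAL_STATUSES.has(status);
}
```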

`wait_for_result`

Check status and retrieve results for a QA job.

Parameters:

jobId (string, required): The ID returned from create_job
waitSeconds (number, optional): How long to wait before re-checking if the job is not yet complete (default: 30, range: 1-300)

Behavior:

Checks status immediately (returns right away if already complete)
Waits for the specified duration if not complete
Checks status again after waiting
Returns results if complete, otherwise suggests calling again
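
The four steps above can be sketched as a single function. This is an illustration of the described behavior, not the server's actual implementation: `waitForResultOnce`, `StatusCheck`, and `Sleep` are hypothetical names, and the status lookup is injected so the sketch does not assume any particular API call.

```typescript
// Hypothetical status lookup; the real server queries the Runhuman API.
type StatusCheck = () => Promise<{ status: string; result?: unknown }>;
type Sleep = (ms: number) => Promise<void>;

async function waitForResultOnce(
  checkStatus: StatusCheck,
  waitSeconds: number = 30,
  sleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<{ status: string; result?: unknown; done: boolean }> {
  if (waitSeconds < 1 || waitSeconds > 300) {
    throw new RangeError("waitSeconds must be between 1 and 300");
  }
  // Step 1: check immediately and return right away if already complete.
  const first = await checkStatus();
  if (first.status === "completed") return { ...first, done: true };
  // Step 2: wait for the requested duration.
  await sleep(waitSeconds * 1000);
  // Step 3: check again after waiting.
  const second = await checkStatus();
  // Step 4: report whether the caller should call again (done === false).
  return { ...second, done: second.status === "completed" };
}
```

Injecting `sleep` keeps the function easy to test without real delays.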

Returns when complete:

result: Structured test results matching your schema
status: "completed"
costUsd: Exact cost in USD (e.g., 0.396)
testDurationSeconds: Time spent by tester (rounded up)
testerResponse: Raw natural language feedback from the human
testerAlias: Anonymized tester name (e.g., "Tester Alpha")
testerAvatarUrl: Avatar image URL
testerColor: Hex color for theming
Additional metadata (timestamps, etc.)
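
For typed clients, the completed response could be modeled roughly as follows. The field names are taken from the list above; the TypeScript types are assumptions, and `CompletedResult` and `summarize` are hypothetical names rather than exports of this package.

```typescript
// Field names follow the documented response; the types are assumed.
interface CompletedResult<T> {
  result: T; // structured data matching the schema passed to create_job
  status: "completed";
  costUsd: number;
  testDurationSeconds: number;
  testerResponse: string;
  testerAlias: string;
  testerAvatarUrl: string;
  testerColor: string;
}

// Builds a one-line summary suitable for logs or chat output.
function summarize<T>(r: CompletedResult<T>): string {
  return `${r.testerAlias} finished in ${r.testDurationSeconds}s for $${r.costUsd.toFixed(3)}`;
}
```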

Example workflow:

// Create job
const job = await create_job({
  url: "https://myapp.com",
  description: "Test the login page"
});

// Wait for result with increasing intervals
// (jobId and waitSeconds are passed together, matching the parameters above)
let result = await wait_for_result({ jobId: job.jobId, waitSeconds: 30 });
if (result.status !== 'completed') {
  result = await wait_for_result({ jobId: job.jobId, waitSeconds: 60 });
}
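
The workflow above checks twice. A longer-running job needs a loop, which might look like the sketch below. It assumes a `waitForResult` wrapper around the tool call (hypothetical, injected as a parameter), doubles the wait on each call up to the documented 300-second maximum, and treats `failed` and `timeout` as final, which is also an assumption.

```typescript
type WaitFn = (args: { jobId: string; waitSeconds: number }) => Promise<{ status: string; result?: unknown }>;

// Polls with doubling intervals (30, 60, 120, ...) capped at 300 seconds per call.
async function pollUntilDone(
  waitForResult: WaitFn,
  jobId: string,
  maxCalls: number = 6,
): Promise<{ status: string; result?: unknown }> {
  let waitSeconds = 30;
  for (let call = 0; call < maxCalls; call++) {
    const res = await waitForResult({ jobId, waitSeconds });
    // Assumption: these statuses are final, so polling stops.
    if (res.status === "completed" || res.status === "failed" || res.status === "timeout") {
      return res;
    }
    waitSeconds = Math.min(waitSeconds * 2, 300);
  }
  throw new Error(`Job ${jobId} did not finish after ${maxCalls} calls`);
}
```

Capping `maxCalls` keeps an agent from waiting indefinitely if a job is never claimed.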

Configuration

API URL (Optional)

By default, the server connects to the production API at https://runhuman.com. For local development or testing:

{
  "mcpServers": {
    "runhuman": {
      "command": "npx",
      "args": [
        "-y",
        "@runhuman/mcp",
        "--api-key=qa_live_xxxxx",
        "--api-url=http://localhost:3000"
      ]
    }
  }
}

Environment Variables

Alternatively, use environment variables:

export RUNHUMAN_API_KEY=qa_live_xxxxxxxxxxxxx
export RUNHUMAN_API_URL=https://runhuman.com  # optional
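
To illustrate how flags and environment variables could combine, here is a small sketch of a resolver. The flag and variable names come from this README, but the precedence (flags over environment variables) is an assumption, and `resolveConfig` and `ServerConfig` are hypothetical names.

```typescript
interface ServerConfig {
  apiKey: string;
  apiUrl: string;
}

// Assumption: command-line flags win over environment variables.
function resolveConfig(argv: string[], env: Record<string, string | undefined>): ServerConfig {
  // Reads the value of a `--name=value` style flag, if present.
  const flag = (name: string): string | undefined => {
    const prefix = `--${name}=`;
    const hit = argv.find((arg) => arg.startsWith(prefix));
    return hit === undefined ? undefined : hit.slice(prefix.length);
  };
  const apiKey = flag("api-key") ?? env["RUNHUMAN_API_KEY"];
  if (!apiKey) throw new Error("API key is required");
  // Falls back to the documented production default.
  const apiUrl = flag("api-url") ?? env["RUNHUMAN_API_URL"] ?? "https://runhuman.com";
  return { apiKey, apiUrl };
}
```

In a Node.js process this could be called as `resolveConfig(process.argv.slice(2), process.env)`.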

Pricing

Runhuman uses pay-per-second pricing:

$0.0018 per second of tester time
Duration is rounded up to the nearest second
Example: 220 seconds of testing = $0.396

Costs are calculated exactly and returned in the API response. No monthly fees, no minimums.
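
For budgeting, the pricing rule can be reproduced locally. This is a sketch based only on the rate and rounding stated above; `estimateCostUsd` is a hypothetical helper, and the authoritative figure is the `costUsd` value returned by the API. The arithmetic is done in ten-thousandths of a dollar to avoid floating-point drift.

```typescript
// $0.0018 per second equals 18 ten-thousandths of a dollar per second.
const RATE_TEN_THOUSANDTHS_PER_SECOND = 18;

function estimateCostUsd(durationSeconds: number): number {
  if (!Number.isFinite(durationSeconds) || durationSeconds < 0) {
    throw new RangeError("durationSeconds must be a non-negative number");
  }
  // Duration is rounded up to the next whole second before billing.
  const billedSeconds = Math.ceil(durationSeconds);
  return (billedSeconds * RATE_TEN_THOUSANDTHS_PER_SECOND) / 10000;
}

console.log(estimateCostUsd(220)); // 0.396, matching the example above
console.log(estimateCostUsd(300)); // 0.54 for a full 5-minute test
```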

How It Works

AI creates job - Your AI agent calls create_job with test instructions
Job posted to testers - Request goes to the human tester pool via Slack
Human performs test - A real person tests the application (video/screenshots recorded)
Human reports findings - Tester describes what they observed in natural language
AI extracts data - GPT-4o processes feedback into structured JSON matching your schema
AI gets results - wait_for_result returns clean, typed data ready for automation

Example Use Cases

AI Coding Agents

User: "Can you update my checkout page and verify it works?"
Agent: [Updates code, deploys, then uses create_job to verify]
Agent: "✅ Updated and verified by human tester. Payment flow works correctly."

CI/CD Pipelines

- name: Human QA Gate
  run: |
    # Deploy preview
    # Call Runhuman API via MCP
    # Fail build if test fails
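
The "fail build if test fails" step needs a rule for turning a result into an exit code. One possible rule is sketched below, reusing the checkout schema from the create_job example. `GateInput` and `gateExitCode` are hypothetical names, and the pass criteria are an assumption that each pipeline should adapt to its own schema.

```typescript
// Hypothetical result shape based on the checkout schema shown earlier.
interface GateInput {
  status: string;
  result?: {
    checkout_loaded?: boolean;
    payment_options_visible?: boolean;
    any_errors?: string;
  };
}

// Returns a process exit code: 0 passes the gate, 1 fails the build.
function gateExitCode(input: GateInput): 0 | 1 {
  // Anything other than a completed job with a result fails the gate.
  if (input.status !== "completed" || !input.result) return 1;
  const { checkout_loaded, payment_options_visible, any_errors } = input.result;
  const noErrorsReported = (any_errors ?? "").trim() === "";
  return checkout_loaded === true && payment_options_visible === true && noErrorsReported ? 0 : 1;
}
```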

Visual Testing

Agent: "Please verify the new homepage design looks good on mobile"
[Human tester provides UX feedback]
Agent: "Human feedback: Design looks great, minor spacing issue noted..."

Development

Local Development

# Clone the repo
git clone https://github.com/volter-ai/runhuman.git
cd runhuman

# Install dependencies
npm install

# Build the MCP server
npm run build --workspace=@runhuman/mcp

# Run with local API
cd packages/mcp-server
node dist/index.js --api-key=qa_live_test_key_123 --api-url=http://localhost:3000

Testing

# Run tests
npm run test --workspace=@runhuman/mcp

# Use MCP Inspector (interactive testing)
npm run test:inspector --workspace=@runhuman/mcp

Project Structure

packages/mcp-server/
├── src/
│   ├── index.ts                 # CLI entry point (stdio)
│   ├── mcp-server-factory.ts    # Core server factory
│   ├── lib.ts                   # Library exports
│   ├── types.ts                 # TypeScript types
│   └── tools/                   # Tool implementations
│       ├── create-job.ts
│       └── wait-for-result.ts
├── dist/                        # Compiled output
├── tests/                       # Test suite
├── docs/                        # Developer documentation
├── package.json
├── tsconfig.json
└── README.md                    # This file

Documentation

Model Context Protocol - Learn about MCP
Runhuman Website - Product information
API Documentation - REST API reference
GitHub Repository - Full source code

Developer Docs

How Agents Use MCP - Understanding MCP tool discovery
Tool Response Best Practices - Writing effective tool responses
API Authentication - Authentication details

Integration Examples

Claude Desktop Usage

Once installed, Claude can naturally use the tools:

User: "Test my app at https://staging.myapp.com and check if the login works"

Claude:

I'll create a QA job to test your login page.
[Calls create_job tool]
Job created! Waiting for a human tester...
[Calls wait_for_result periodically]
✅ Test complete! The login page works correctly. The tester was able to log in successfully with test credentials.

Programmatic Usage (Custom MCP Client)

import { createMcpServer } from '@runhuman/mcp/factory';

const server = createMcpServer({
  mode: 'direct',
  apiUrl: 'https://runhuman.com',
  apiKey: 'qa_live_xxxxx'
});

// Use with your MCP client.
// `transport` is not defined in this snippet; supply the transport your client uses.
await server.connect(transport);

Troubleshooting

"Error: API key is required"

Make sure you've:

Created an account at runhuman.com
Generated an API key in the dashboard
Added the key to your config with the correct format (--api-key=qa_live_...)

"Invalid API key format"

API keys must start with:

qa_live_ for production
qa_test_ for testing

Get a valid key at runhuman.com/dashboard
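
A key's prefix can be checked locally before starting the server. The sketch below covers only the two prefixes listed above; `keyEnvironment` is a hypothetical helper, and the server may enforce additional format rules.

```typescript
type KeyEnvironment = "production" | "testing";

// Maps a key to its environment by prefix, or throws if the prefix is unknown.
function keyEnvironment(apiKey: string): KeyEnvironment {
  if (apiKey.startsWith("qa_live_")) return "production";
  if (apiKey.startsWith("qa_test_")) return "testing";
  throw new Error("Invalid API key format");
}
```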

Connection Issues

If you're having trouble connecting:

Check that your API URL is correct (default: https://runhuman.com)
Verify your network connection
Test the API directly:

curl https://runhuman.com/api/jobs \
  -H "Authorization: Bearer qa_live_xxxxx" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com","description":"test","outputSchema":{}}'

Claude Desktop Not Detecting Server

Verify config file location is correct for your OS
Check JSON syntax is valid (use a JSON validator)
Restart Claude Desktop completely
Check Claude's logs for errors

Contributing

This package is part of the Runhuman monorepo. Contributions welcome!

# Clone the full repo
git clone https://github.com/volter-ai/runhuman.git
cd runhuman

# Install dependencies
npm install

# Make changes to packages/mcp-server/

# Build and test
npm run build --workspace=@runhuman/mcp
npm run test --workspace=@runhuman/mcp

# Submit PR

Support

Email: [email protected]
Issues: GitHub Issues
Website: runhuman.com

License

ISC License - See LICENSE for details

About Runhuman

Runhuman is an AI-orchestrated human QA testing service. Part of the Volter AI ecosystem.

Key Features:

On-demand human testing (no hiring/managing)
AI-powered orchestration and result extraction
Pay-per-second pricing ($0.0018/sec)
Custom output schemas for structured results
Browser automation with video/screenshot recording
GitHub Actions integration
REST API and MCP server for any workflow

Built with ❤️ by the Volter AI team
