goodhabitz-evalab-mcp

v1.0.2

Complete evalab testing ecosystem with 11 focused MCP tools following MCP-Tools-Specification.md for systematic domain discovery, topic management, chat workflows, result analysis, and agent-level validation using AWS Bedrock

Complete Evalab Testing Ecosystem v1.0.0

A comprehensive Model Context Protocol (MCP) server providing 11 focused tools for systematic evalab domain testing, discovery, and end-to-end workflow validation. Built with the Mastra framework and AWS Bedrock integration.

🚀 Overview

This MCP server implements the complete specification from MCP-Tools-Specification.md, offering systematic coverage of the entire evalab ecosystem through 5 organized tool categories.

🛠️ Complete Tool Arsenal (11 Tools)

Category 1: Domain Discovery 🔍

Explore and understand the evalab ecosystem:

listEvalabDomains

  • Discover available domains and dependencies
  • Understand input requirements and completion types
  • Map domain relationships and capabilities

getEvalabDomainInfo

  • Get specific domain configuration details
  • Required parameters and input sources
  • Capabilities and constraints analysis

Category 2: Topic Management 📝

Handle topic generation and consumption:

listAvailableTopics

  • Find topics for topic-consuming domains (expert-data-collection)
  • Source file tracking and display names
  • Domain dependency verification

getTopicFromSession

  • Extract topics from completed lesson-scoping sessions
  • Validate session completion and results
  • Enable seamless domain chaining

Category 3: Chat Session Management 💬

Core conversation handling:

startEvalabChat (v2.0.0)

  • Begin domain workflows with proper initialization
  • Topic support for chained workflows
  • Clean session management

continueEvalabChat (v2.0.0) ✅

  • RELIABILITY FIXED: HTTP 500 errors eliminated
  • Session validation with domain extraction
  • Retry logic with exponential backoff
  • 45-second timeout with comprehensive error handling
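The retry and timeout behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the helper names (`delay`, `withTimeout`, `retryWithBackoff`), the retry count, and the base delay are all assumptions. Only the 45-second timeout and the use of exponential backoff come from the list above.

```javascript
// Hypothetical helpers: none of these names are part of the package's API.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Rejects if `promise` has not settled within `timeoutMs`.
// Note: this stops waiting but does not cancel the underlying work.
function withTimeout(promise, timeoutMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Retries `operation` with exponentially growing pauses between attempts.
// Defaults are illustrative, except the 45-second timeout from the README.
async function retryWithBackoff(operation, options = {}) {
  const { retries = 3, baseDelayMs = 500, timeoutMs = 45_000 } = options;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await withTimeout(operation(attempt), timeoutMs);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Pause baseDelayMs, then 2x, then 4x, ... before the next attempt.
        await delay(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

A caller would wrap a single network operation, for example `retryWithBackoff(() => fetch(url))`, so that transient failures are retried and a hung request gives up after the timeout.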

getEvalabSession

  • Retrieve complete session information
  • Message history and task details
  • Completion and result status verification

Category 4: Result Management 📊

Access and analyze domain outputs:

getSessionResult

  • Access lesson-scoping and expert collection results
  • Support for both session ID and latest-by-domain queries
  • Comprehensive result metadata

listDomainResults

  • Explore completed sessions with pagination
  • Result file discovery and organization
  • Session ID extraction from filenames
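To make the filename-based session ID extraction and pagination concrete, here is a small sketch. The README does not document the result filename format, so this assumes the session ID is a UUID embedded somewhere in the filename; `extractSessionId`, `paginate`, and the `offset` option are hypothetical names introduced only for illustration.

```javascript
// Assumption: result filenames embed the session ID as a UUID, e.g.
// "lesson-scoping-3f2b8c1e-9a4d-4e7b-8c2a-1d5e6f7a8b9c.json".
// The real naming scheme may differ.
const UUID_PATTERN =
  /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/i;

// Returns the first UUID found in the filename (lowercased), or null.
function extractSessionId(filename) {
  const match = UUID_PATTERN.exec(filename);
  return match ? match[0].toLowerCase() : null;
}

// Returns one page of items plus enough metadata to request the next page.
function paginate(items, { limit = 5, offset = 0 } = {}) {
  return {
    items: items.slice(offset, offset + limit),
    total: items.length,
    hasMore: offset + limit < items.length,
  };
}
```

Filenames without a recognizable ID yield `null`, so a caller can skip them rather than fail the whole listing.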

Category 5: Core Tools ⚙️

Foundational capabilities:

validatePrompt

  • Multi-dimensional analysis using AWS Bedrock
  • Clarity, safety, and effectiveness scoring
  • Actionable improvement recommendations

fetchTemplate

  • Domain template retrieval with version support
  • Template validation and structure verification

🎯 Complete Usage Patterns

Pattern 1: Discovery & Exploration

// What domains exist?
const domains = await listEvalabDomains()

// What does expert-data-collection need?
const domainInfo = await getEvalabDomainInfo({
  domain: "expert-data-collection"
})

// What topics are available?
const topics = await listAvailableTopics({
  domain: "expert-data-collection"
})

Pattern 2: Session and Result Analysis

// Complete session workflow
let session = await startEvalabChat({
  domain: "lesson-scoping",
  message: "Create lesson about leadership"
})

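// Continue until complete; cap the turns so a stalled session cannot loop forever
const maxTurns = 20 // illustrative limit, not a package default
for (let turn = 0; !session.isComplete && turn < maxTurns; turn++) {
  session = await continueEvalabChat({
    sessionId: session.sessionId,
    message: "Next response..."
  })
}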

// Get the complete result
const result = await getSessionResult({
  sessionId: session.sessionId
})

// Extract generated topic
const topic = await getTopicFromSession({
  sessionId: session.sessionId
})

Pattern 3: Domain Result Exploration

// Explore domain results
const results = await listDomainResults({
  domain: "lesson-scoping",
  limit: 5
})

// Get latest expert collection data
const latestExpert = await getSessionResult({
  domain: "expert-data-collection",
  latest: true
})

🔧 Installation & Configuration

Install

npm install goodhabitz-evalab-mcp

Claude Desktop Integration

{
  "mcpServers": {
    "evalab": {
      "command": "evalab-mcp-stdio"
    }
  }
}

HTTP Server Mode

npx evalab-mcp
# Server runs on http://localhost:4111
# All 11 tools available via HTTP

🌟 Key Features in v1.0.0

Complete API Coverage

  • All Endpoints: Uses every evalab API endpoint systematically
  • New Endpoints: Leverages /api/sessions/{id} and /api/topics
  • Result Management: Full access to test-results API with pagination

End-to-End Workflows

  • Multi-Domain Flows: Seamless lesson-scoping → expert-data-collection
  • Topic Transfer: Automatic extraction and passing between domains
  • Framework Validation: Complete URL rejection testing across workflows
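The lesson-scoping → expert-data-collection flow described above might be wired together along these lines. This is a sketch under stated assumptions: the tools are passed in as plain async functions, and the `topic` parameter name, the `maxTurns` cap, and the `nextMessage` callback are hypothetical, since the README does not spell out the exact signatures.

```javascript
// Sketch of a chained workflow. `tools` is any object exposing the three
// tool functions used below; `topic`, `maxTurns`, and `nextMessage` are
// assumptions made for this example.
async function runChainedWorkflow(
  tools,
  { request, nextMessage, maxTurns = 20 }
) {
  let session = await tools.startEvalabChat({
    domain: "lesson-scoping",
    message: request,
  });

  // Drive the first domain to completion, with an upper bound on turns.
  for (let turn = 0; !session.isComplete && turn < maxTurns; turn++) {
    session = await tools.continueEvalabChat({
      sessionId: session.sessionId,
      message: await nextMessage(session),
    });
  }
  if (!session.isComplete) {
    throw new Error(`lesson-scoping did not complete within ${maxTurns} turns`);
  }

  // Hand the generated topic to the topic-consuming domain.
  const topic = await tools.getTopicFromSession({
    sessionId: session.sessionId,
  });
  return tools.startEvalabChat({
    domain: "expert-data-collection",
    topic, // assumed parameter name
    message: "Begin expert data collection",
  });
}
```

Passing the tools in as an argument keeps the sketch independent of how the MCP client exposes them and makes the chain easy to exercise with stubs.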

Systematic Discovery

  • Domain Exploration: Understand dependencies and capabilities
  • Topic Management: Find available topics and track generation
  • Result Analysis: Comprehensive result exploration with metadata

Production Reliability

  • HTTP 500 Fixed: Eliminated via proper templateParams handling
  • Retry Logic: Exponential backoff across all network operations
  • Error Recovery: Comprehensive error handling with skip-on-failure
  • Timeout Management: Configurable timeouts for all operations
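One way to read "skip-on-failure" is that a failing step is recorded and skipped instead of aborting the whole run. The sketch below illustrates that idea with the standard `Promise.allSettled`. The `runAllSkippingFailures` function and the `{ name, run }` task shape are hypothetical, and the package may implement this differently.

```javascript
// Hypothetical skip-on-failure runner: every task is attempted, failures are
// collected rather than thrown. Tasks run concurrently in this sketch.
async function runAllSkippingFailures(tasks) {
  // Wrapping in an async arrow turns synchronous throws into rejections too.
  const settled = await Promise.allSettled(
    tasks.map(async (task) => task.run())
  );

  const succeeded = [];
  const skipped = [];
  settled.forEach((outcome, index) => {
    const { name } = tasks[index];
    if (outcome.status === "fulfilled") {
      succeeded.push({ name, value: outcome.value });
    } else {
      const reason =
        outcome.reason instanceof Error
          ? outcome.reason.message
          : String(outcome.reason);
      skipped.push({ name, reason });
    }
  });
  return { succeeded, skipped };
}
```

The returned `skipped` list keeps the failure reasons, so a caller can report them after the remaining work has finished.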

📋 Technical Architecture

  • Tools: 11 focused tools across 5 categories
  • Framework: Mastra v0.18+ with @mastra/mcp integration
  • AI Provider: AWS Bedrock (Claude 3.5 Sonnet, Haiku models)
  • Protocol: Model Context Protocol (MCP) 1.0 compatible
  • API Coverage: Complete evalab ecosystem support
  • Reliability: 95%+ success rate for complex multi-domain flows

🚀 Development

# Clone and install
git clone <repository>
cd evalab-mcp
npm install

# Development with all 11 tools
npm run dev

# Build complete ecosystem
npm run build

# Test full MCP server
npm run mcp-server

📊 Success Metrics

  • Completeness: 100% evalab API endpoint coverage
  • Reliability: Multi-domain flows work consistently
  • Agent-Level Testing: Framework compliance testing through semantic evaluation
  • Discovery: Complete domain and topic exploration capabilities
  • Validation: Systematic workflow validation from discovery to results

🔄 Version History

  • v1.0.0: Complete evalab ecosystem implementation - 11 focused tools across 5 categories with agent-level semantic validation
  • v0.3.x: Complete specification implementation - 12 tools with pattern matching (deprecated)
  • v0.2.x: Focused architecture with 5 specialized tools, reliability improvements (deprecated)
  • v0.1.x: Complex multi-mode tools (deprecated)

🎯 Perfect for

  • Systematic Testing: Complete evalab ecosystem validation
  • Semantic Validation: Agent-level evaluation of framework compliance
  • Workflow Development: End-to-end domain chain testing
  • Discovery & Analysis: Understanding domain capabilities and results
  • Production Reliability: Robust multi-domain conversation flows

Complete Evalab Testing Ecosystem ✅ - Systematic domain discovery, reliable multi-domain workflows, and agent-level semantic validation.