Skills MCP Server
A Model Context Protocol (MCP) server modeled after the Agent Skills Specification and designed to give AI agents dynamic, extensible capabilities. It allows you to manage, execute, share, and discover "skills" — discrete units of functionality (scripts, automations, tools) — directly from your file system or remote Git repositories.
🚀 Key Features
- 📝 Agent Skills Compliant: Fully supports the Agent Skills Specification via an install-time transformation layer.
- 🧠 Auto-Discovery: Intelligently infers metadata from code and scans `scripts/` directories found in repositories to generate executable actions.
- 🏗️ Remote Execution: Installs dependencies and executes skills in a secure, isolated environment using Daytona, providing both security and a consistent runtime for every skill.
- 📦 Dependency Inference: Automatically detects missing Python dependencies (imports) and installs them.
- 🔍 Vector Search: Find the right tool for the job with semantic search instead of packing every skill's metadata into an appended system prompt. This improves discoverability, adds flexibility, and keeps the system prompt's context footprint small.
- 🧠 Expert Guidance: The server proactively provides tips for fixing common errors (e.g., missing dependencies, typos).
- Lifecycle Management: Full CRUD support — Create, Read, Update, Delete, and package skills.
📝 Specification Compliance
The server fully supports the Agent Skills Specification.
Compliance Matrix
| Feature | Requirement | Implementation Strategy |
| :--- | :--- | :--- |
| Directory Structure | skill/SKILL.md | ✅ Native Support |
| Discovery | Recursive scanning | ✅ Native Support |
| Resources | assets/, references/ | ✅ Auto-Discovery |
| Metadata | license, compatibility, metadata map | ✅ Full Parsing Support |
| Execution | Instruction-based | ✅ Install-Time Transformation: Scans scripts to generate executable actions automatically. |
| Sandboxing | Isolated environment | ✅ Daytona Integration |
📦 Installation
Install globally via npm:
```bash
npm install -g @presto-ai/skills-mcp-server
```
Or run directly with npx (no installation required):
```bash
npx @presto-ai/skills-mcp-server
```
Prerequisites
- Node.js (v18 or higher)
- `npm` or `npx` installed
🔌 Configuration
Setup for MCP Clients
Add the following configuration to your MCP client's config file (e.g., claude_desktop_config.json for Claude Desktop, or the settings for Cursor/Windsurf).
Note: You must create a directory for your skills (e.g., ~/skills-storage) before running.
```json
{
"mcpServers": {
"skills": {
"command": "npx",
"args": [
"-y",
"@presto-ai/skills-mcp-server"
],
"env": {
"SKILLS_DIRECTORY": "/Users/YOUR_USERNAME/skills-storage",
"SKILLS_OUTPUT_DIRECTORY": "/Users/YOUR_USERNAME/Documents/SkillOutput",
"DAYTONA_API_KEY": "your-daytona-api-key",
"DAYTONA_API_URL": "https://api.daytona.io",
"DAYTONA_TARGET": "us",
"OPENROUTER_API_KEY": "sk-or-...",
"GITHUB_TOKEN": "ghp_..."
}
}
}
}
```
Environment Variables
- `SKILLS_DIRECTORY` (Required): Absolute path to the folder where skills will be stored.
- `DAYTONA_API_KEY` (Required): API key for Daytona remote execution. All skill execution happens in isolated cloud sandboxes for security.
- `SKILLS_OUTPUT_DIRECTORY` (Optional): Default location for files downloaded from remote execution.
- `DAYTONA_API_URL` (Optional): The API URL (default: `https://api.daytona.io`).
- `DAYTONA_TARGET` (Optional): The target region/provider (default: `us`).
- `OPENROUTER_API_KEY` (Optional): For semantic search capabilities.
- `GITHUB_TOKEN` (Optional): For accessing private repositories and publishing skills.
- `DEFAULT_PUBLISH_REPO` (Optional): Default Git repository URL for publishing skills (e.g., `https://github.com/user/my-skills`).
🔒 Security Note: Local code execution has been removed entirely. All skill scripts run in isolated Daytona sandboxes, ensuring untrusted code never executes on your machine.
🛡️ Remote Execution (Daytona)
The server requires Daytona for all skill execution, ensuring complete isolation from your local machine.
[!TIP] For a deep dive into how remote execution works, including snapshot management, file synchronization, and lifecycle examples, see docs/remote-sandbox-execution.md.
- Required Setup: Add `DAYTONA_API_KEY` to your environment. The server will not start without it.
- Strict Isolation: All skill executions run in ephemeral cloud sandboxes. There is no local execution capability, ensuring untrusted code never runs on your machine.
- Intelligent Caching: The server automatically manages snapshots to ensure sub-second startup times for repeated executions.
- Outputs: Files and directories created by the skill are automatically downloaded to `SKILLS_OUTPUT_DIRECTORY`, preserving structure.
- Maintenance: To clean up lingering remote resources, run `npx tsx scripts/cleanup_daytona.ts`. See docs/daytona-troubleshooting.md for details.
- Automatic Lifecycle:
  - Auto-Stop: Sandboxes terminate after 15 minutes of inactivity to save resources.
  - Self-Healing: The server automatically cleans up orphaned sandboxes from previous sessions on startup to prevent clutter.
Manual Build (Local Development)
If you want to contribute to the server or run a local fork:
- Clone: `git clone https://github.com/jrenaldi79/skills-mcp-server.git`
- Install: `npm install`
- Build: `npm run build`
- Run: `node build/index.js`
📚 Usage
Starting the Server
```bash
npm start
```
Connect via your MCP client (e.g., Claude Desktop, Zed, VS Code).
Creating a Skill
You can create a skill in two ways:
- Using the Tool: Ask Claude to "Create a skill called 'weather-checker' that fetches the forecast."
- Manually (see the example script below):
  - Create a folder: `$SKILLS_DIRECTORY/my-skill`
  - Add a script: `script.py`
  - (Optional) Add `SKILL.md` for metadata.
  - The server will automatically discover `script.py` and expose it as a tool!
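For reference, a minimal `script.py` that the server could pick up might look like the sketch below. This is an assumption about a convenient shape for a skill script (clear docstring, standard `argparse` flags), not a required format:

```python
"""my-skill: Return a short greeting for a given name."""
import argparse


def main() -> None:
    # A descriptive docstring and standard CLI flags give the server's
    # metadata inference the most to work with.
    parser = argparse.ArgumentParser(description="Return a short greeting.")
    parser.add_argument("--name", default="world", help="Who to greet")
    args = parser.parse_args()
    print(f"Hello, {args.name}!")


if __name__ == "__main__":
    main()
```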
Managing Dependencies
The server handles dependencies proactively:
- Remote Sandbox: Dependencies are automatically installed in isolated Daytona sandboxes during execution.
- Auto-Inference (see the sketch below):
  - Python: Scans `.py` files for `import` statements and generates `requirements.txt`.
  - Node.js: Scans `.js`/`.ts` files for `require()`/`import` and generates `package.json`. This enables "raw script" skills to work without manual configuration.
- Rebuild Sandbox: Use `skill_prepare --force=true` if you encounter missing-package errors (rebuilds the sandbox with fresh dependencies).
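As a rough illustration of the Python side of auto-inference, the sketch below maps import scanning to a `requirements.txt`. It is not the server's actual implementation; the stdlib filter is simplified and it ignores cases where the import name differs from the pip package name (e.g., `docx` vs `python-docx`):

```python
import re
import sys
from pathlib import Path

# Capture the top-level module name from "import foo" / "from foo import bar".
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_]\w*)", re.MULTILINE)


def infer_requirements(skill_dir: str) -> list[str]:
    """Collect top-level imports from .py files that are not standard-library modules."""
    found: set[str] = set()
    for py_file in Path(skill_dir).rglob("*.py"):
        found.update(IMPORT_RE.findall(py_file.read_text(encoding="utf-8")))
    # Whatever is not in the standard library is assumed to be a pip dependency.
    return sorted(name for name in found if name not in sys.stdlib_module_names)


if __name__ == "__main__":
    reqs = infer_requirements("./my-skill")
    Path("./my-skill/requirements.txt").write_text("\n".join(reqs) + "\n")
```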
🏃 Execution & Code Workflow
The server supports two execution models, allowing LLMs to either use pre-built tools or write their own code.
1. Managed Skills (Function Calling)
Use this for established, reliable tools.
- Workflow: Agent calls `skill_execute("weather", "get_forecast", { city: "London" })`.
- Mechanism: Server runs the pre-defined `script.py` with arguments (see the sketch below).
- Safety: High (code is immutable).
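To make the managed model concrete, here is one plausible shape for the `weather` skill's pre-defined `script.py`. It assumes the action's arguments are forwarded as CLI flags and uses the public `wttr.in` endpoint purely for illustration; none of this is dictated by the server:

```python
"""weather: Fetch a short forecast for a city (illustrative sketch)."""
import argparse

import requests  # third-party import; auto-inference would add it to requirements.txt


def get_forecast(city: str) -> str:
    # wttr.in returns a one-line forecast when asked for format=3.
    response = requests.get(f"https://wttr.in/{city}", params={"format": "3"}, timeout=10)
    response.raise_for_status()
    return response.text.strip()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--city", required=True)
    args = parser.parse_args()
    print(get_forecast(args.city))
```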
2. Dynamic Execution (Sandbox)
Use this when the Agent needs to write code on the fly (e.g., creating a custom document, analyzing data).
- Workflow:
  1. Agent writes code strings or commands.
  2. Agent calls `skill_execute` with `execution_mode: 'shell'`.
     - Python (Inline): `skill_execute("utility-skill", "run", { args: ["python", "-c", "print('hello')"], execution_mode: "shell" })`
     - Node.js (Inline): `skill_execute("utility-skill", "run", { args: ["node", "-e", "console.log('hello')"], execution_mode: "shell" })`
     - TypeScript (File): Agent writes `script.ts` to the skill directory, then executes `skill_execute("utility-skill", "run", { args: ["npx", "ts-node", "script.ts"], execution_mode: "shell" })`
- Mechanism: Server bypasses the skill's script and runs the command directly in the skill's environment (cwd).
- Use Case: "Create a Word document" -> Agent writes a Python script using `python-docx` and executes it (see the example below). "Write a TS utility" -> Agent writes code and runs it.
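For the Word-document use case, the kind of throwaway script an agent might write and run in the sandbox could look like this (a sketch assuming `python-docx` is available; the filename and content are made up):

```python
from docx import Document  # python-docx, installed by the sandbox's dependency handling

# Build a small Word document; files written here are synced back to SKILLS_OUTPUT_DIRECTORY.
doc = Document()
doc.add_heading("Quarterly Summary", level=1)
doc.add_paragraph("This document was generated dynamically inside a Daytona sandbox.")
doc.save("quarterly_summary.docx")
print("Wrote quarterly_summary.docx")
```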
🧰 Available Tools for Agents
The server exposes these tools to the connected LLM:
🔍 Discovery & Search
- `skill_list`: List all installed skills. Use sparingly; prefer `skill_search` for discovery to save context window.
- `skill_search`: Find skills by query (e.g., "tool to edit PDF"). Primary method for finding tools.
- `skill_get`: Inspect the source code and instructions of a skill.
- `skill_read_file`: Read specific documentation or files within a skill's directory.
🏃 Execution
- `skill_execute`: Run a skill's action. The server manages the child process, environment variables, and output capture.
- `skill_prepare`: Explicitly warm up a remote skill environment (create a snapshot) to eliminate cold-start latency (requires Daytona).
📦 Repository & Lifecycle
- `skill_repo_add`: Install a skill collection from GitHub (e.g., `https://github.com/user/my-skills`).
- `skill_repo_update`: Pull the latest changes for a repository.
- `skill_repo_list`: View installed repositories.
- `skill_repo_remove`: Completely remove a repository and all its skills from the vendor directory.
- `skill_delete`: Remove a skill (supports both local and vendor skills).
- `skill_package`: Zip a skill for sharing.
- `skill_install_package`: Import a `.skill` zip file.
- `skill_repo_push`: Publish a locally-created skill to a Git repository for backup or team sharing (requires `GITHUB_TOKEN`).
✏️ Authoring
- `skill_create`: Generate a new skill scaffold.
- `skill_setup`: Update server configuration (e.g., change the output folder) on the fly.
📂 Repository Structure
Skills are organized as follows:
```
SKILLS_DIRECTORY/
├── my-local-skill/            # Created locally
│   ├── SKILL.md
│   └── script.py
└── vendor/                    # Installed from Git
    └── github_user__repo/
        ├── requirements.txt   # Shared dependencies
        ├── skill-one/
        │   └── ...
        └── skill-two/
            └── ...
```
🧪 Testing
Unit Tests
Run the unit test suite to verify core functionality:
```bash
npm test
```
Includes unit tests for managers, handlers, and file system operations.
Agentic E2E Tests
The project includes a novel agentic E2E test harness that validates the MCP server by having real LLMs (Claude 4.5 Sonnet and Google Gemini 3 Pro) execute multi-step tasks end-to-end.
[!TIP] For a deep dive into how the harness works, configuration options, and how to extend it, see docs/agentic-test-harness.md.
What It Tests
Unlike traditional E2E tests that simulate user actions, these tests connect actual production LLMs to the MCP server and verify they can:
- Discover and search for skills
- Create new skills dynamically with dependencies
- Execute skills in remote Daytona sandboxes
- Chain multiple skills together
- Handle errors and retry strategies
Test Scenarios
| Scenario | Description |
|----------|-------------|
| Connection Check | Verify basic LLM connectivity |
| Document Creation | Agent creates a Word document skill and uses it |
| Branded Document | Agent creates a branding skill, then uses it with a document skill |
| Presentation Creation | Agent creates and executes a PowerPoint skill |
| Skill Creation | Agent creates a custom joke-teller skill |
| Skill Execution | Agent executes the previously created skill |
Requirements
```bash
# Required environment variables in .env
OPENROUTER_API_KEY=sk-or-...            # OpenRouter API key
DAYTONA_API_KEY=dtn_...                 # Daytona API key
DAYTONA_API_URL=https://app.daytona.io
DAYTONA_TARGET=us
SKILLS_OUTPUT_DIRECTORY=/path/to/output
```
Running E2E Tests
```bash
# Run all agentic scenarios (takes ~10-15 minutes)
npm test tests/e2e/agent_scenarios.test.ts

# Run with debug logging
LOG_LEVEL=debug npm test tests/e2e/agent_scenarios.test.ts
```
How It Works
- Agent Runner: The `AgentRunner` class connects to OpenRouter's LLM API (see the conceptual sketch after this list).
- Tool Discovery: The agent discovers all available MCP tools via the server.
- Autonomous Execution: The agent executes a natural-language goal (e.g., "Create a Word document").
- Loop Detection: Prevents infinite loops by detecting repeated tool calls.
- Verification: Tests verify output files exist and tasks completed successfully.
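A conceptual sketch of that cycle is below. The real harness lives in `tests/harness/AgentRunner.ts`; this Python sketch only illustrates the discover, call, loop-detect, verify flow, and the callables it takes (`list_tools`, `ask_llm`, `call_tool`) are hypothetical stand-ins, with loop detection simplified to "identical call repeated":

```python
from typing import Callable, Optional


def run_goal(
    goal: str,
    list_tools: Callable[[], list[str]],
    ask_llm: Callable[[str, list[str], list[str]], Optional[str]],
    call_tool: Callable[[str], str],
    max_steps: int = 20,
) -> list[str]:
    """Drive an LLM toward a goal using MCP tools, stopping on completion or a repeat."""
    tools = list_tools()                             # 1. Tool discovery via the MCP server
    transcript: list[str] = []
    seen_calls: set[str] = set()
    for _ in range(max_steps):
        tool_call = ask_llm(goal, tools, transcript)  # 2. LLM proposes the next tool call
        if tool_call is None:                         #    ...or declares the goal complete
            break
        if tool_call in seen_calls:                   # 3. Loop detection on repeated calls
            transcript.append(f"loop detected on {tool_call}; aborting")
            break
        seen_calls.add(tool_call)
        transcript.append(call_tool(tool_call))       # 4. Execute against the server
    return transcript                                 # 5. The test then verifies outputs on disk
```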
Current Status
Test Results: ~33% pass rate (4 passed, 8 failed)
The relatively low pass rate is expected and valuable — it reveals real UX challenges:
- Dependency management (python-docx, python-pptx installation)
- Parameter handling between different execution modes
- Script type detection (Python vs Bash)
These tests prove the server works end-to-end with real LLMs while exposing areas for UX improvement.
Architecture
```
tests/
├── e2e/
│   └── agent_scenarios.test.ts   # Test scenarios
├── harness/
│   └── AgentRunner.ts            # LLM integration layer
└── integration/
    └── *.test.ts                 # Traditional integration tests
```
The harness is reusable and can be extended with new scenarios to validate future features.
