Skills MCP Server
A Model Context Protocol (MCP) server modeled after the Agent Skills Specification and designed to give AI agents dynamic, extensible capabilities. It allows you to manage, execute, share, and discover "skills" — discrete units of functionality (scripts, automations, tools) — directly from your file system or remote Git repositories.
🚀 Key Features
- 📝 Agent Skills Compliant: Fully supports the Agent Skills Specification via an install-time transformation layer.
- 🧠 Auto-Discovery: Intelligently infers metadata from code and scans `scripts/` directories found in repositories to generate executable actions.
- 🏗️ Remote Execution: Installs dependencies and executes skills in a secure, isolated environment using Daytona, providing both security and a consistent runtime for every skill.
- 📦 Dependency Inference: Automatically detects missing Python dependencies (imports) and installs them.
- 🔍 Vector Search: Find the right tool for the job with semantic search instead of packing every skill's metadata into an appended system prompt. This improves discoverability, adds flexibility, and keeps the system prompt's context footprint small.
- 🧠 Expert Guidance: The server proactively provides tips for fixing common errors (e.g., missing dependencies, typos).
- Lifecycle Management: Full CRUD support — Create, Read, Update, Delete, and package skills.
📝 Specification Compliance
The server fully supports the Agent Skills Specification.
Compliance Matrix
| Feature | Requirement | Implementation Strategy |
| :--- | :--- | :--- |
| Directory Structure | skill/SKILL.md | ✅ Native Support |
| Discovery | Recursive scanning | ✅ Native Support |
| Resources | assets/, references/ | ✅ Auto-Discovery |
| Metadata | license, compatibility, metadata map | ✅ Full Parsing Support |
| Execution | Instruction-based | ✅ Install-Time Transformation: Scans scripts to generate executable actions automatically. |
| Sandboxing | Isolated environment | ✅ Daytona Integration |
📦 Installation
Install globally via npm:
```bash
npm install -g @presto-ai/skills-mcp-server
```
Or run directly with npx (no installation required):
```bash
npx @presto-ai/skills-mcp-server
```
Prerequisites
- Node.js (v18 or higher)
- `npm` or `npx` installed
🔌 Configuration
Setup for MCP Clients
Add the following configuration to your MCP client's config file (e.g., claude_desktop_config.json for Claude Desktop, or the settings for Cursor/Windsurf).
Note: You must create a directory for your skills (e.g., ~/skills-storage) before running.
```json
{
"mcpServers": {
"skills": {
"command": "npx",
"args": [
"-y",
"@presto-ai/skills-mcp-server"
],
"env": {
"SKILLS_DIRECTORY": "/Users/YOUR_USERNAME/skills-storage",
"SKILLS_OUTPUT_DIRECTORY": "/Users/YOUR_USERNAME/Documents/SkillOutput",
"DAYTONA_API_KEY": "your-daytona-api-key",
"DAYTONA_API_URL": "https://api.daytona.io",
"DAYTONA_TARGET": "us",
"OPENROUTER_API_KEY": "sk-or-...",
"GITHUB_TOKEN": "ghp_..."
}
}
}
}
```
Environment Variables
- `SKILLS_DIRECTORY` (Required): Absolute path to the folder where skills will be stored.
- `DAYTONA_API_KEY` (Required): API key for Daytona remote execution. All skill execution happens in isolated cloud sandboxes for security.
- `SKILLS_OUTPUT_DIRECTORY` (Optional): Default location for files downloaded from remote execution.
- `DAYTONA_API_URL` (Optional): The API URL (default: `https://api.daytona.io`).
- `DAYTONA_TARGET` (Optional): The target region/provider (default: `us`).
- `OPENROUTER_API_KEY` (Optional): For semantic search capabilities.
- `GITHUB_TOKEN` (Optional): For accessing private repositories and publishing skills.
- `DEFAULT_PUBLISH_REPO` (Optional): Default Git repository URL for publishing skills (e.g., `https://github.com/user/my-skills`).
🔒 Security Note: Local code execution has been removed entirely. All skill scripts run in isolated Daytona sandboxes, ensuring untrusted code never executes on your machine.
🛡️ Remote Execution (Daytona)
The server requires Daytona for all skill execution, ensuring complete isolation from your local machine.
[!TIP] For a deep dive into how remote execution works, including snapshot management, file synchronization, and lifecycle examples, see docs/remote-sandbox-execution.md.
- Required Setup: Add `DAYTONA_API_KEY` to your environment. The server will not start without it.
- Strict Isolation: All skill executions run in ephemeral cloud sandboxes. There is no local execution capability, ensuring untrusted code never runs on your machine.
- Intelligent Caching: The server automatically manages snapshots to ensure sub-second startup times for repeated executions.
- Outputs: Files and directories created by the skill are automatically downloaded to `SKILLS_OUTPUT_DIRECTORY`, preserving structure.
- Maintenance: To clean up lingering remote resources, run `npx tsx scripts/cleanup_daytona.ts`. See docs/daytona-troubleshooting.md for details.
- Automatic Lifecycle:
  - Auto-Stop: Sandboxes terminate after 15 minutes of inactivity to save resources.
  - Self-Healing: The server automatically cleans up orphaned sandboxes from previous sessions on startup to prevent clutter.
Manual Build (Local Development)
If you want to contribute to the server or run a local fork:
- Clone: `git clone https://github.com/jrenaldi79/skills-mcp-server.git`
- Install: `npm install`
- Build: `npm run build`
- Run: `node build/index.js`
📚 Usage
Starting the Server
```bash
npm start
```
Connect via your MCP client (e.g., Claude Desktop, Zed, VS Code).
Creating a Skill
You can create a skill in two ways:
- Using the Tool: Ask Claude to "Create a skill called 'weather-checker' that fetches the forecast."
- Manually (see the example script below):
  - Create a folder: `$SKILLS_DIRECTORY/my-skill`
  - Add a script: `script.py`
  - (Optional) Add `SKILL.md` for metadata.
  - The server will automatically discover `script.py` and expose it as a tool!
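For reference, a minimal `script.py` that the server could pick up might look like the sketch below. This is an assumption about a convenient shape for a skill script (clear docstring, standard `argparse` flags), not a required format:

```python
"""my-skill: Return a short greeting for a given name."""
import argparse


def main() -> None:
    # A descriptive docstring and standard CLI flags give the server's
    # metadata inference the most to work with.
    parser = argparse.ArgumentParser(description="Return a short greeting.")
    parser.add_argument("--name", default="world", help="Who to greet")
    args = parser.parse_args()
    print(f"Hello, {args.name}!")


if __name__ == "__main__":
    main()
```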
Managing Dependencies
The server handles dependencies proactively:
- Remote Sandbox: Dependencies are automatically installed in isolated Daytona sandboxes during execution.
- Auto-Inference (see the sketch below):
  - Python: Scans `.py` files for `import` statements and generates `requirements.txt`.
  - Node.js: Scans `.js`/`.ts` files for `require()`/`import` and generates `package.json`. This enables "raw script" skills to work without manual configuration.
- Rebuild Sandbox: Use `skill_prepare --force=true` if you encounter missing-package errors (rebuilds the sandbox with fresh dependencies).
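As a rough illustration of the Python side of auto-inference, the sketch below maps import scanning to a `requirements.txt`. It is not the server's actual implementation; the stdlib filter is simplified and it ignores cases where the import name differs from the pip package name (e.g., `docx` vs `python-docx`):

```python
import re
import sys
from pathlib import Path

# Capture the top-level module name from "import foo" / "from foo import bar".
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_]\w*)", re.MULTILINE)


def infer_requirements(skill_dir: str) -> list[str]:
    """Collect top-level imports from .py files that are not standard-library modules."""
    found: set[str] = set()
    for py_file in Path(skill_dir).rglob("*.py"):
        found.update(IMPORT_RE.findall(py_file.read_text(encoding="utf-8")))
    # Whatever is not in the standard library is assumed to be a pip dependency.
    return sorted(name for name in found if name not in sys.stdlib_module_names)


if __name__ == "__main__":
    reqs = infer_requirements("./my-skill")
    Path("./my-skill/requirements.txt").write_text("\n".join(reqs) + "\n")
```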
🏃 Execution & Code Workflow
The server supports two execution models, allowing LLMs to either use pre-built tools or write their own code.
1. Managed Skills (Function Calling)
Use this for established, reliable tools.
- Workflow: Agent calls `skill_execute("weather", "get_forecast", { city: "London" })`.
- Mechanism: Server runs the pre-defined `script.py` with arguments (see the sketch below).
- Safety: High (code is immutable).
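To make the managed model concrete, here is one plausible shape for the `weather` skill's pre-defined `script.py`. It assumes the action's arguments are forwarded as CLI flags and uses the public `wttr.in` endpoint purely for illustration; none of this is dictated by the server:

```python
"""weather: Fetch a short forecast for a city (illustrative sketch)."""
import argparse

import requests  # third-party import; auto-inference would add it to requirements.txt


def get_forecast(city: str) -> str:
    # wttr.in returns a one-line forecast when asked for format=3.
    response = requests.get(f"https://wttr.in/{city}", params={"format": "3"}, timeout=10)
    response.raise_for_status()
    return response.text.strip()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--city", required=True)
    args = parser.parse_args()
    print(get_forecast(args.city))
```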
2. Dynamic Execution (Sandbox)
Use this when the Agent needs to write code on the fly (e.g., creating a custom document, analyzing data).
- Workflow:
  1. Agent writes code strings or commands.
  2. Agent calls `skill_execute` with `execution_mode: 'shell'`.
     - Python (Inline): `skill_execute("utility-skill", "run", { args: ["python", "-c", "print('hello')"], execution_mode: "shell" })`
     - Node.js (Inline): `skill_execute("utility-skill", "run", { args: ["node", "-e", "console.log('hello')"], execution_mode: "shell" })`
     - TypeScript (File): Agent writes `script.ts` to the skill directory, then executes `skill_execute("utility-skill", "run", { args: ["npx", "ts-node", "script.ts"], execution_mode: "shell" })`
- Mechanism: Server bypasses the skill's script and runs the command directly in the skill's environment (cwd).
- Use Case: "Create a Word document" -> Agent writes a Python script using `python-docx` and executes it (see the example below). "Write a TS utility" -> Agent writes code and runs it.
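For the Word-document use case, the kind of throwaway script an agent might write and run in the sandbox could look like this (a sketch assuming `python-docx` is available; the filename and content are made up):

```python
from docx import Document  # python-docx, installed by the sandbox's dependency handling

# Build a small Word document; files written here are synced back to SKILLS_OUTPUT_DIRECTORY.
doc = Document()
doc.add_heading("Quarterly Summary", level=1)
doc.add_paragraph("This document was generated dynamically inside a Daytona sandbox.")
doc.save("quarterly_summary.docx")
print("Wrote quarterly_summary.docx")
```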
🧰 Available Tools for Agents
The server exposes these tools to the connected LLM:
🔍 Discovery & Search
- `skill_list`: List all installed skills. Use sparingly; prefer `skill_search` for discovery to save context window.
- `skill_search`: Find skills by query (e.g., "tool to edit PDF"). Primary method for finding tools.
- `skill_get`: Inspect the source code and instructions of a skill.
- `skill_read_file`: Read specific documentation or files within a skill's directory.
🏃 Execution
- `skill_execute`: Run a skill's action. The server manages the child process, environment variables, and output capture.
- `skill_prepare`: Explicitly warm up a remote skill environment (create a snapshot) to eliminate cold-start latency (requires Daytona).
📦 Repository & Lifecycle
- `skill_repo_add`: Install a skill collection from GitHub (e.g., `https://github.com/user/my-skills`).
- `skill_repo_update`: Pull the latest changes for a repository.
- `skill_repo_list`: View installed repositories.
- `skill_repo_remove`: Completely remove a repository and all its skills from the vendor directory.
- `skill_delete`: Remove a skill (supports both local and vendor skills).
- `skill_package`: Zip a skill for sharing.
- `skill_install_package`: Import a `.skill` zip file.
- `skill_repo_push`: Publish a locally-created skill to a Git repository for backup or team sharing (requires `GITHUB_TOKEN`).
✏️ Authoring
- `skill_create`: Generate a new skill scaffold.
- `skill_setup`: Update server configuration (e.g., change the output folder) on the fly.
📂 Repository Structure
Skills are organized as follows:
```
SKILLS_DIRECTORY/
├── my-local-skill/            # Created locally
│   ├── SKILL.md
│   └── script.py
└── vendor/                    # Installed from Git
    └── github_user__repo/
        ├── requirements.txt   # Shared dependencies
        ├── skill-one/
        │   └── ...
        └── skill-two/
            └── ...
```
🧪 Testing
Unit Tests
Run the unit test suite to verify core functionality:
```bash
npm test
```
Includes unit tests for managers, handlers, and file system operations.
Agentic E2E Tests
The project includes a novel agentic E2E test harness that validates the MCP server by having real LLMs (Claude 4.5 Sonnet and Google Gemini 3 Pro) execute multi-step tasks end-to-end.
[!TIP] For a deep dive into how the harness works, configuration options, and how to extend it, see docs/agentic-test-harness.md.
What It Tests
Unlike traditional E2E tests that simulate user actions, these tests connect actual production LLMs to the MCP server and verify they can:
- Discover and search for skills
- Create new skills dynamically with dependencies
- Execute skills in remote Daytona sandboxes
- Chain multiple skills together
- Handle errors and retry strategies
Test Scenarios
| Scenario | Description |
|----------|-------------|
| Connection Check | Verify basic LLM connectivity |
| Document Creation | Agent creates a Word document skill and uses it |
| Branded Document | Agent creates a branding skill, then uses it with a document skill |
| Presentation Creation | Agent creates and executes a PowerPoint skill |
| Skill Creation | Agent creates a custom joke-teller skill |
| Skill Execution | Agent executes the previously created skill |
Requirements
```bash
# Required environment variables in .env
OPENROUTER_API_KEY=sk-or-...            # OpenRouter API key
DAYTONA_API_KEY=dtn_...                 # Daytona API key
DAYTONA_API_URL=https://app.daytona.io
DAYTONA_TARGET=us
SKILLS_OUTPUT_DIRECTORY=/path/to/output
```
Running E2E Tests
```bash
# Run all agentic scenarios (takes ~10-15 minutes)
npm test tests/e2e/agent_scenarios.test.ts

# Run with debug logging
LOG_LEVEL=debug npm test tests/e2e/agent_scenarios.test.ts
```
How It Works
- Agent Runner: The `AgentRunner` class connects to OpenRouter's LLM API (see the conceptual sketch after this list).
- Tool Discovery: The agent discovers all available MCP tools via the server.
- Autonomous Execution: The agent executes a natural-language goal (e.g., "Create a Word document").
- Loop Detection: Prevents infinite loops by detecting repeated tool calls.
- Verification: Tests verify output files exist and tasks completed successfully.
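A conceptual sketch of that cycle is below. The real harness lives in `tests/harness/AgentRunner.ts`; this Python sketch only illustrates the discover, call, loop-detect, verify flow, and the callables it takes (`list_tools`, `ask_llm`, `call_tool`) are hypothetical stand-ins, with loop detection simplified to "identical call repeated":

```python
from typing import Callable, Optional


def run_goal(
    goal: str,
    list_tools: Callable[[], list[str]],
    ask_llm: Callable[[str, list[str], list[str]], Optional[str]],
    call_tool: Callable[[str], str],
    max_steps: int = 20,
) -> list[str]:
    """Drive an LLM toward a goal using MCP tools, stopping on completion or a repeat."""
    tools = list_tools()                             # 1. Tool discovery via the MCP server
    transcript: list[str] = []
    seen_calls: set[str] = set()
    for _ in range(max_steps):
        tool_call = ask_llm(goal, tools, transcript)  # 2. LLM proposes the next tool call
        if tool_call is None:                         #    ...or declares the goal complete
            break
        if tool_call in seen_calls:                   # 3. Loop detection on repeated calls
            transcript.append(f"loop detected on {tool_call}; aborting")
            break
        seen_calls.add(tool_call)
        transcript.append(call_tool(tool_call))       # 4. Execute against the server
    return transcript                                 # 5. The test then verifies outputs on disk
```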
Current Status
Test Results: ~33% pass rate (4 passed, 8 failed)
The relatively low pass rate is expected and valuable — it reveals real UX challenges:
- Dependency management (python-docx, python-pptx installation)
- Parameter handling between different execution modes
- Script type detection (Python vs Bash)
These tests prove the server works end-to-end with real LLMs while exposing areas for UX improvement.
Architecture
```
tests/
├── e2e/
│   └── agent_scenarios.test.ts   # Test scenarios
├── harness/
│   └── AgentRunner.ts            # LLM integration layer
└── integration/
    └── *.test.ts                 # Traditional integration tests
```
The harness is reusable and can be extended with new scenarios to validate future features.
