demosmith-mcp
v0.1.1
An MCP (Model Context Protocol) server for automated demo recording with video, documentation, and screenshot generation. Perfect for creating product demos, tutorials, and documentation with AI agents.
Demo

Demo: GitHub login flow with animated cursor, click effects, and auto-generated documentation
Features
- Video Recording - Automatic screen recording of browser sessions
- Screenshot Capture - Automatic screenshots at each step
- Animated Cursor - Smooth cursor animations with click effects and sounds
- TTS Narration - AI-powered voiceover with multiple providers (OpenAI, ElevenLabs, Azure, Edge)
- Multiple Output Formats:
  - Video (WebM)
  - Video with Audio (MP4)
  - Playwright Trace (interactive replay)
  - Markdown Guide
  - JSON Steps
  - Narration Script + JSON (with timestamps)
  - Subtitles (SRT/VTT)
  - Interactive HTML Tutorial
  - GIF Preview
- Multi-language Support - English and Chinese
- Multi-tab Support - Work with multiple browser tabs
- Flexible Element Selection - By ref, text, label, placeholder, CSS, XPath
Installation
npm install demosmith-mcp
npx playwright install chromium
Usage
As MCP Server
Add to your Claude Code MCP configuration (~/.claude/mcp.json):
{
  "mcpServers": {
    "demosmith": {
      "command": "npx",
      "args": ["demosmith-mcp"]
    }
  }
}
CLI Mode
# Replay a recorded demo
demosmith replay ./steps.json -o ./output --video
# Generate documentation from steps
demosmith generate ./steps.json -l zh -o ./docs
# Serve generated files locally
demosmith serve ./output
MCP Tools
Session Management
| Tool | Description |
|------|-------------|
| demosmith_start | Start a new demo recording session |
| demosmith_end | End session and generate all deliverables |
| demosmith_status | Get current session status |
Navigation & Discovery
| Tool | Description |
|------|-------------|
| demosmith_navigate | Navigate to a URL |
| demosmith_snapshot | Get accessibility tree snapshot for element refs |
Core Actions
| Tool | Description |
|------|-------------|
| demosmith_click | Click an element (with animated cursor) |
| demosmith_fill | Fill a text input (with typing animation) |
| demosmith_select | Select from dropdown |
| demosmith_press_key | Press keyboard key or combination |
| demosmith_hover | Hover over element (for tooltips/menus) |
| demosmith_drag | Drag and drop |
| demosmith_upload | Upload file |
Page Actions
| Tool | Description |
|------|-------------|
| demosmith_scroll | Scroll page or element |
| demosmith_wait | Wait for condition |
| demosmith_screenshot | Take manual screenshot |
Verification
| Tool | Description |
|------|-------------|
| demosmith_assert | Verify conditions (text, visibility, URL, etc.) |
Tab Management
| Tool | Description |
|------|-------------|
| demosmith_new_tab | Open new browser tab |
| demosmith_switch_tab | Switch to different tab |
| demosmith_close_tab | Close a tab |
| demosmith_list_tabs | List all open tabs |
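Several of the tools above take a ref argument that identifies the target element, written either as a bare snapshot ref or as a prefixed string such as "label:Email" or "role:button:Submit" (the forms listed under Element Selectors). As a rough illustration of how such strings break down into a strategy and a value, here is a minimal sketch. The names `ParsedSelector` and `parseSelector` are hypothetical and this is not demosmith's actual parser; it assumes that a string without a colon is a snapshot ref.

```typescript
// Hypothetical types for illustration; demosmith's internal shape is not documented here.
type SelectorKind =
  | "ref" | "text" | "label" | "placeholder" | "role"
  | "testid" | "css" | "xpath" | "alt" | "title";

interface ParsedSelector {
  kind: SelectorKind;
  value: string;     // text, label, CSS, XPath, role, ...
  name?: string;     // accessible name, only for "role:<role>:<name>"
  isRegex?: boolean; // true for "text:/.../" patterns
}

const PREFIXES: SelectorKind[] = [
  "text", "label", "placeholder", "role", "testid", "css", "xpath", "alt", "title",
];

// Map a raw prefix to a known selector kind, or undefined if it is not recognized.
function toKind(prefix: string): SelectorKind | undefined {
  for (const candidate of PREFIXES) {
    if (candidate === prefix) {
      return candidate;
    }
  }
  return undefined;
}

function parseSelector(input: string): ParsedSelector {
  const sep = input.indexOf(":");
  if (sep === -1) {
    // Assumption: no prefix means a snapshot ref such as "1" or "2".
    return { kind: "ref", value: input };
  }
  const prefix = input.slice(0, sep);
  const rest = input.slice(sep + 1);
  const kind = toKind(prefix);
  if (kind === undefined) {
    throw new Error("Unknown selector prefix: " + prefix);
  }
  if (kind === "role") {
    // "role:button:Submit" -> role "button", accessible name "Submit"
    const nameSep = rest.indexOf(":");
    if (nameSep === -1) {
      return { kind, value: rest };
    }
    return { kind, value: rest.slice(0, nameSep), name: rest.slice(nameSep + 1) };
  }
  const wrappedInSlashes =
    rest.length >= 2 && rest.charAt(0) === "/" && rest.charAt(rest.length - 1) === "/";
  if (kind === "text" && wrappedInSlashes) {
    // "text:/Submit|Cancel/" -> regular-expression source "Submit|Cancel"
    return { kind, value: rest.slice(1, -1), isRegex: true };
  }
  return { kind, value: rest };
}
```

Only the first colon separates the prefix from the value, so selectors whose value contains colons or slashes (for example "xpath://button[@type='submit']") pass through intact.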
Element Selectors
demosmith supports multiple ways to locate elements:
# By ref (from snapshot)
"1", "2", "3"
# By visible text
"text:Submit"
"text:/Submit|Cancel/" # regex
# By label
"label:Email"
# By placeholder
"placeholder:Enter your name"
# By role and name
"role:button:Submit"
"role:textbox"
# By test ID
"testid:submit-btn"
# By CSS selector
"css:.btn-primary"
# By XPath
"xpath://button[@type='submit']"
# By alt text
"alt:Logo"
# By title
"title:Close"
Example Workflow
1. demosmith_start(url="https://example.com/login", title="Login Demo")
2. demosmith_snapshot() → Get element refs
3. demosmith_fill(ref="label:Email", value="[email protected]", description="Enter email")
4. demosmith_fill(ref="label:Password", value="password123", description="Enter password")
5. demosmith_click(ref="text:Sign In", description="Click sign in button")
6. demosmith_assert(type="url", expected="/dashboard", description="Verify redirect")
7. demosmith_end() → Returns all deliverables
Output Files
After ending a session, the following files are generated:
output/
├── demo.webm # Screen recording video
├── demo-with-audio.mp4 # Video with TTS narration (if TTS enabled)
├── demo.gif # Animated GIF preview
├── trace.zip # Playwright trace (interactive replay)
├── guide.md # Markdown documentation
├── steps.json # Structured step data
├── narration.txt # Voiceover script
├── narration.json # Timed narration for TTS APIs
├── narration.mp3 # Generated audio (if TTS enabled)
├── subtitles.srt # SRT subtitles
├── subtitles.vtt # VTT subtitles
├── tutorial.html # Interactive HTML tutorial
├── animated-preview.html # HTML preview (fallback)
└── assets/
    ├── step-001.png
    ├── step-002.png
    └── ...
See examples/github-login-demo/ for a complete example output.
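The listing includes both narration.json and subtitles.srt. To show how timed narration segments relate to subtitle cues, here is a minimal sketch that converts segments into SRT text. It assumes segments carry the `stepId`, `startMs`, `endMs`, and `text` fields shown in the Narration JSON Format section; the function names are hypothetical, and this is not necessarily how demosmith produces its own subtitle files.

```typescript
// Segment shape based on the narration.json sample; other fields are omitted.
interface NarrationSegment {
  stepId: number;
  startMs: number;
  endMs: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to a fixed width.
function zeroPad(value: number, width: number): string {
  let out = String(value);
  while (out.length < width) {
    out = "0" + out;
  }
  return out;
}

// Format milliseconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(ms: number): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  return zeroPad(hours, 2) + ":" + zeroPad(minutes, 2) + ":" +
    zeroPad(seconds, 2) + "," + zeroPad(millis, 3);
}

// Build SRT text: numbered cues, each with a time range and one caption line,
// separated by blank lines.
function segmentsToSrt(segments: NarrationSegment[]): string {
  return segments
    .map((seg, i) =>
      String(i + 1) + "\n" +
      toSrtTime(seg.startMs) + " --> " + toSrtTime(seg.endMs) + "\n" +
      seg.text + "\n")
    .join("\n");
}
```

For the sample segment (2000 ms to 4500 ms), the cue's time line would read `00:00:02,000 --> 00:00:04,500`. VTT differs mainly in its `WEBVTT` header and in using a period rather than a comma before the milliseconds.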
Configuration Options
Start Session Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| title | string | required | Demo title |
| startUrl | string | required | Starting URL |
| outputDir | string | temp dir | Output directory |
| video | boolean | true | Record video |
| trace | boolean | true | Record Playwright trace |
| screenshotOnStep | boolean | true | Auto-screenshot each step |
| headless | boolean | false | Run browser headless |
| viewport | object | 1280x720 | Browser viewport size |
| storageState | string | - | Path to login state file |
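To make the defaults in this table concrete, the sketch below models the options as a TypeScript interface and fills in the documented defaults. The type and function names are hypothetical, and the `width`/`height` field names for `viewport` are an assumption (they follow Playwright's convention); the table itself only states that the default is 1280x720.

```typescript
// Assumed shape for the viewport object; the README only gives "1280x720".
interface Viewport {
  width: number;
  height: number;
}

interface StartOptions {
  title: string;             // required
  startUrl: string;          // required
  outputDir?: string;        // default: a temp dir chosen by the server
  video?: boolean;           // default: true
  trace?: boolean;           // default: true
  screenshotOnStep?: boolean; // default: true
  headless?: boolean;        // default: false
  viewport?: Viewport;       // default: 1280x720
  storageState?: string;     // path to a saved login state file
}

interface ResolvedStartOptions {
  title: string;
  startUrl: string;
  outputDir: string | undefined;
  video: boolean;
  trace: boolean;
  screenshotOnStep: boolean;
  headless: boolean;
  viewport: Viewport;
  storageState: string | undefined;
}

// Apply the documented defaults to whatever the caller left unset.
function resolveStartOptions(opts: StartOptions): ResolvedStartOptions {
  if (opts.title.trim() === "" || opts.startUrl.trim() === "") {
    throw new Error("title and startUrl are required");
  }
  return {
    title: opts.title,
    startUrl: opts.startUrl,
    outputDir: opts.outputDir, // undefined means "use a temp dir"
    video: opts.video ?? true,
    trace: opts.trace ?? true,
    screenshotOnStep: opts.screenshotOnStep ?? true,
    headless: opts.headless ?? false,
    viewport: opts.viewport ?? { width: 1280, height: 720 },
    storageState: opts.storageState,
  };
}
```

Only `title` and `startUrl` need to be supplied; everything else falls back to the table's defaults, so a headless CI run would typically override just `headless` and `outputDir`.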
Animation Options
Click and fill actions support animation options:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| animated | boolean | true | Enable cursor animation |
| moveDuration | number | 500 | Cursor movement duration (ms) |
| typeDelay | number | 50 | Delay between keystrokes (ms) |
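These timings affect how long each recorded step runs. As a back-of-the-envelope sketch, the function below estimates the duration of a fill action under the assumption that the cursor move and the per-keystroke delays simply add up; the actual timing inside demosmith may include other pauses, and the function name is hypothetical.

```typescript
interface AnimationOptions {
  animated?: boolean;    // default: true
  moveDuration?: number; // default: 500 ms
  typeDelay?: number;    // default: 50 ms per keystroke
}

// Rough estimate: cursor movement plus one delay per typed character.
// Assumption: with animation disabled, the action is treated as instantaneous.
function estimateFillDurationMs(value: string, opts: AnimationOptions = {}): number {
  const animated = opts.animated ?? true;
  if (!animated) {
    return 0;
  }
  const moveDuration = opts.moveDuration ?? 500;
  const typeDelay = opts.typeDelay ?? 50;
  return moveDuration + typeDelay * value.length;
}
```

With the defaults, typing the 11-character value "password123" would take roughly 500 + 50 × 11 = 1050 ms, which is useful when budgeting the total length of a demo video.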
Assert Types
The demosmith_assert tool supports these verification types:
| Type | Description |
|------|-------------|
| text | Check element text content |
| visible | Check element is visible |
| hidden | Check element is hidden |
| url | Check current URL |
| title | Check page title |
| value | Check input value |
| checked | Check checkbox is checked |
| enabled | Check element is enabled |
| disabled | Check element is disabled |
| count | Check number of matching elements |
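One way to read this table is as a mapping from each assert type to a check against observed page or element state. The sketch below illustrates that mapping over a plain data snapshot. Everything here is hypothetical: the `Observed` shape, the function name, and the matching rules (substring match for `text` and `url`, exact match for `title` and `value`) are assumptions chosen to be consistent with the example workflow's `expected="/dashboard"`, not documented behavior.

```typescript
type AssertType =
  | "text" | "visible" | "hidden" | "url" | "title"
  | "value" | "checked" | "enabled" | "disabled" | "count";

// Hypothetical snapshot of what was observed for the page and target element.
interface Observed {
  url: string;
  pageTitle: string;
  text: string;
  inputValue: string;
  visible: boolean;
  checked: boolean;
  enabled: boolean;
  matchCount: number;
}

// Return true when the observed state satisfies the assertion.
function checkAssertion(
  type: AssertType,
  observed: Observed,
  expected?: string | number,
): boolean {
  switch (type) {
    case "text":
      return observed.text.indexOf(String(expected)) !== -1; // assumed: substring
    case "visible":
      return observed.visible;
    case "hidden":
      return !observed.visible;
    case "url":
      return observed.url.indexOf(String(expected)) !== -1; // assumed: substring
    case "title":
      return observed.pageTitle === String(expected); // assumed: exact match
    case "value":
      return observed.inputValue === String(expected); // assumed: exact match
    case "checked":
      return observed.checked;
    case "enabled":
      return observed.enabled;
    case "disabled":
      return !observed.enabled;
    case "count":
      return observed.matchCount === Number(expected);
  }
}
```

The state-only types (`visible`, `hidden`, `checked`, `enabled`, `disabled`) need no expected value, while the others compare against one.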
Multi-language Support
Generated content supports English and Chinese. Set via CLI:
demosmith generate ./steps.json -l zh # Chinese
demosmith generate ./steps.json -l en # English (default)
Custom Templates
You can provide custom templates for output generation using Handlebars-style syntax (Mustache-like, with {{#each}} and {{#if}} blocks):
# {{session.title}}
{{#each steps}}
## Step {{this.id}}: {{this.description}}
{{#if this.screenshotRelative}}

{{/if}}
{{/each}}
Login Session Support
Save a login session with Playwright:
await context.storageState({ path: 'auth.json' });
Use in demo:
demosmith_start(url="...", title="...", storageState="auth.json")
TTS Narration
Generate AI voiceover for your demos by passing TTS options to demosmith_end:
demosmith_end(tts={
  provider: "openai",
  apiKey: "sk-...",
  voice: "alloy"
})
Supported TTS Providers
| Provider | API Key Required | Voices | Notes |
|----------|-----------------|--------|-------|
| openai | Yes | alloy, echo, fable, onyx, nova, shimmer | Best quality |
| elevenlabs | Yes | Various voice IDs | Most natural |
| azure | Yes | en-US-JennyNeural, etc. | SSML support |
| edge | No | en-US-AriaNeural, etc. | Free, requires edge-tts CLI |
TTS Options
| Option | Type | Description |
|--------|------|-------------|
| provider | string | TTS provider (openai, elevenlabs, azure, edge) |
| apiKey | string | API key (not needed for edge) |
| voice | string | Voice ID or name |
| language | string | Language code (e.g., en-US, zh-CN) |
| speed | number | Speech speed multiplier |
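The two tables above imply a simple rule: every provider except edge needs an API key. A minimal sketch of a pre-flight check along those lines follows. The type and function names are hypothetical, and the requirement that `speed` be positive is an assumption; the README only describes it as a multiplier.

```typescript
type TtsProvider = "openai" | "elevenlabs" | "azure" | "edge";

interface TtsOptions {
  provider: TtsProvider;
  apiKey?: string;   // not needed for edge
  voice?: string;
  language?: string; // e.g. "en-US", "zh-CN"
  speed?: number;    // speech speed multiplier
}

// Returns a list of problems; an empty list means the options look usable.
function validateTtsOptions(opts: TtsOptions): string[] {
  const problems: string[] = [];
  const needsKey = opts.provider !== "edge";
  if (needsKey && (opts.apiKey === undefined || opts.apiKey.trim() === "")) {
    problems.push('provider "' + opts.provider + '" requires an apiKey');
  }
  // Assumption: a multiplier only makes sense when it is greater than zero.
  if (opts.speed !== undefined && !(opts.speed > 0)) {
    problems.push("speed must be a positive multiplier");
  }
  return problems;
}
```

Running a check like this before calling demosmith_end avoids finishing a full recording session only to have narration fail on a missing key.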
Environment Variables
For Azure TTS, set the region:
export AZURE_SPEECH_REGION=eastus
Narration JSON Format
The generated narration.json contains timed segments for custom TTS integration:
{
  "title": "Login Demo",
  "totalDurationMs": 15000,
  "segments": [
    {
      "stepId": 1,
      "startMs": 2000,
      "endMs": 4500,
      "durationMs": 2500,
      "text": "Click the login button"
    }
  ]
}
Development
# Install dependencies
pnpm install
# Build
pnpm build
# Run MCP server
pnpm start
# Run CLI
pnpm cli help
License
MIT
