@openlibing/ds-cli
v1.4.0
DolphinScheduler CLI with workflow diagnosis (false-success detection), log filtering, and AI agent skill integration (installs to .agents/skills by default).
ds-cli — DolphinScheduler CLI
A command-line interface for DolphinScheduler with data query, analysis, workflow management, one-click instance diagnosis, and full MCP (Model Context Protocol) server support for AI agent integration.
Published by openlibing on npm.
Features
- Data Query: Query datasources, view table structures, execute SQL
- Data Analysis: Statistics on workflows, tasks, queues, and system health
- Workflow Management: Create, run, schedule, and monitor workflows
- Cross-Environment Workflow Sync (ds workflow export / ds workflow import): Recursively export a workflow definition (with all SUB_WORKFLOW children) to JSON and import it into another environment. Definitions are processed in topological order (leaves first), with automatic OFFLINE→edit→ONLINE state-machine handling, DS-assigned code remapping, and SUB_WORKFLOW reference rewriting via codeMap. Dry-run by default; requires --confirm to write.
- Run Sub-Workflow Tasks (ds run start): Start any workflow or sub-workflow instance with full DS API parameters (tenantCode, scheduleTime JSON, execType, warningGroupId). Includes a search-and-run flow (ds workflow list -s <keyword> → ds run start) and a failure-diagnosis flow (ds inspect).
- Instance Diagnosis (ds diagnose): One-click false-success detection, sub-workflow drill-down, and log structure analysis, designed for DS v3.x false-success ("假成功") patterns
- One-Click Failure Analysis (ds analyze failure): Automatically locates the instance, lists tasks, drills down SUB_WORKFLOW to leaf tasks, identifies failures (including false-success), summarizes error logs, and provides a root cause, with no need to manually chain env → project → instance → tasks → diagnose → log
- End-to-End Investigation (ds investigate): Process-oriented 4-in-1 commands covering workflow overview, single-task deep-dive, cluster health check, and Gantt timeline, designed for "ask AI to investigate" scenarios
- L3 Process Tools (4 new one-click tools):
  - ds analyze slowness: find why a workflow instance is slow
  - ds investigate delay: diagnose data delay issues
  - ds investigate resource: check cluster Master/Worker/queue status
  - ds investigate task-troubleshoot: quick diagnostic for a single task
- Smart Log View (ds log view): Tail/grep/exec-only filters, full-log pagination, no temp files
- Log Throughput Analysis (ds log throughput): Extract SeaTunnel Read/Write trends, delta rates, and average throughput from task logs
- Bottleneck Detection (ds stats bottleneck): Identify top-N workflows by total duration, failure rate, and avg/max/min breakdown
- Instance Top-N (ds instance top-n): Quickly find the longest-running workflow instances with auto-pagination
- Gantt Timeline (ds instance gantt): Visualize task execution timeline for a workflow instance
- Multi-Environment: Manage multiple DolphinScheduler instances (HTTP/HTTPS) with different tokens
- AI Skills: Built-in YAML Skills for AI Agent integration (default .agents/skills/), including multi-step Playbook Skills for incident response, data delay, and cluster health
- 4-Layer MCP Architecture (55 tools):
  - L1 Atomic (~38): single-point read/write (list/get/create/update/delete)
  - L2 Analysis (~7): analyze_* (failure, slowness, throughput, bottleneck)
  - L3 Process (10): investigate_* + analyze failure/slowness (multi-step diagnostic)
  - L4 Skill (1): ds_skill_scenario (free-text → Playbook)
  - Multi-Environment Support: All tools accept an optional env parameter
  - 18 Write Operations clearly marked with ⚠️ in description for AI safety
  - Chinese Trigger Phrases: 5 L3 tools include natural Chinese triggers (e.g. "工作流 X 失败的原因是什么", meaning "what is the reason workflow X failed", and "为什么这么慢", meaning "why is it so slow")
  - Data-Driven Recommendation Graph (skills/recommendation-graph.yaml): 20+ edges drive the → recommended_next: suggestions, with heuristic fallback
  - Scenario Matching: ds_skill_scenario maps free-text user descriptions to the right Playbook
- NPM Install: Installable via npm install -g
Installation
Global Install (recommended)
npm install -g @openlibing/ds-cli
Local Install
npm install @openlibing/ds-cli
npx @openlibing/ds-cli <command>
From Source
git clone https://github.com/openlibing/ds-cli.git
cd ds-cli
npm install
node bin/ds <command>
Quick Start
# 1. Initialize with interactive wizard (requires IP, Protocol and Token)
ds init
# 2. Or manually add an environment
ds env add -n dev -i localhost:12345 --protocol http -t <your-token>
# 3. Add another environment (e.g. production)
ds env add -n prod -i <prod-ip>:12345 -t <prod-token>
# 4. List all environments
ds env list
# 5. Switch environment
ds env use prod
# 6. Test the connection
ds env test
# 7. Try a command
ds project list
Note: No username, password, or login is required. Just provide the IP, protocol, and token, and you're ready to go.
For HTTPS endpoints:
ds env add -n beta -i <ip>:12345 -t <token> --protocol https
Commands
Environment Management
- ds env list - List all environments
- ds env current - Show current environment
- ds env use <name> - Switch environment (positional arg, not -n)
- ds env add -n <name> -i <ip:port> -t <token> - Add environment (only IP + Token)
  - --protocol <protocol> - Protocol (http|https, default: http)
- ds env remove -n <name> - Remove environment
- ds env info [name] - Show environment details
- ds env test [name] - Test environment connection
- ds env set-token -n <name> -t <token> - Update token
Authentication
- ds login -t <token> - Login with token (optional, for refreshing token)
- ds logout - Logout
Note: ds-cli uses token-based authentication. No username/password login is required. Just provide the IP and Token when adding an environment.
Project
- ds project list - List projects (auto-falls-back v3 → v2)
  - --created-only - Only show projects created by current user
- ds project create -n <name> - Create project
- ds project delete <code> - Delete project
- ds project info <code> - Show project details
Workflow
- ds workflow list -p <project> - List workflows (with taskCount column)
  - -s, --search <val> - Search by name
- ds workflow create -p <project> -n <name> - Create workflow
- ds workflow delete -p <project> -c <code> - Delete workflow
- ds workflow info -p <project> -c <code> - Show workflow details
- ds workflow release -p <project> -c <code> - Publish workflow
- ds workflow export -p <project> -c <code> [-o <file>] [--no-subs] [-e <env>] - Export a workflow definition (recursively including all SUB_WORKFLOW children) to a JSON file
  - -o, --output <file> - Output JSON path (default: ./workflow-<code>.json)
  - --no-subs - Only export the root, skip child sub-workflows
- ds workflow import -f <file> -p <project> [--confirm] [--no-release] [-e <env>] - Import workflow definitions from a JSON file into the target environment
  - Default is dry-run: prints what would be UPDATED / CREATED / failed; nothing is written. Add --confirm to actually apply changes.
  - --no-release - After import, leave workflows OFFLINE (default releases ONLINE in topological order).
  - The CLI automatically handles: ONLINE→OFFLINE→edit→ONLINE state machine (DS code 50008), DS-assigned new code capture on CREATE, parent-workflow SUB_WORKFLOW reference rewriting (codeMap), and friendly hints for common DS error codes (50003 / 50008 / 10168).
Cross-Environment Workflow Sync Example
Sync the [workflow][raw->dm][codearts]流水线&构建任务数据获取 workflow (and all its sub-workflows) from prod to beta:
# 1. Export from prod (recursively pulls main workflow + all SUB_WORKFLOW children)
ds workflow export -p 169400948446944 -c 169579299839168 -o ./wf.json -e prod
# 2. Preview the diff against beta (dry-run, no writes)
ds workflow import -f ./wf.json -p 169400948446944 -e beta
# 3. Actually apply the changes (writes + releases ONLINE in topological order)
ds workflow import -f ./wf.json -p 169400948446944 -e beta --confirm
# Optional: leave imported workflows OFFLINE (skip ONLINE release)
ds workflow import -f ./wf.json -p 169400948446944 -e beta --confirm --no-release
How it works under the hood:
- Topological sort by SUB_WORKFLOW reference: leaves first, root last. This ensures every child code exists in codeMap before the parent is updated.
- Per-definition decision:
  - If the source code already exists on the target → UPDATE via PUT /workflow-definition/{code} (after OFFLINE if needed).
  - Otherwise → CREATE via POST /workflow-definition. DS will assign a new code; the CLI captures it into codeMap (sourceCode → newCode).
- Reference rewriting: Before each UPDATE, the CLI rewrites all taskParams.workflowDefinitionCode fields in the task definition list using codeMap, so parents always point at the correct target-side child code.
- State machine: Workflows that are ONLINE on the target are automatically OFFLINE'd before editing (works around DS error code 50008, "does not allow edit").
- Final release: After all writes succeed, workflows are released ONLINE in the same topological order (leaves first). Pass --no-release to skip.
- Error hints: DS business error codes 50003 / 50008 / 10168 are translated into actionable suggestions.
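The leaves-first ordering and the codeMap rewriting described above can be sketched as follows. This is a minimal illustration, not the CLI's actual source: the in-memory shape of a definition (`code`, `tasks[].taskType`, `tasks[].taskParams.workflowDefinitionCode`) and the function names `topoOrderLeavesFirst` and `rewriteSubWorkflowRefs` are assumptions made for the example.

```javascript
// Assumed shape: { code, tasks: [{ taskType, taskParams: { workflowDefinitionCode } }] }
function childCodes(def) {
  return def.tasks
    .filter((t) => t.taskType === 'SUB_WORKFLOW')
    .map((t) => t.taskParams.workflowDefinitionCode);
}

// Depth-first post-order: a definition is emitted only after all of its children.
function topoOrderLeavesFirst(defs) {
  const byCode = new Map(defs.map((d) => [d.code, d]));
  const done = new Set();
  const inProgress = new Set();
  const order = [];
  function visit(code) {
    if (done.has(code) || !byCode.has(code)) return;
    if (inProgress.has(code)) throw new Error(`cycle detected at ${code}`);
    inProgress.add(code);
    for (const child of childCodes(byCode.get(code))) visit(child);
    inProgress.delete(code);
    done.add(code);
    order.push(code);
  }
  for (const d of defs) visit(d.code);
  return order;
}

// Returns a copy of def whose SUB_WORKFLOW tasks point at target-side codes.
// Codes missing from codeMap are left unchanged.
function rewriteSubWorkflowRefs(def, codeMap) {
  return {
    ...def,
    tasks: def.tasks.map((t) => {
      if (t.taskType !== 'SUB_WORKFLOW') return t;
      const src = t.taskParams.workflowDefinitionCode;
      const mapped = codeMap.has(src) ? codeMap.get(src) : src;
      return { ...t, taskParams: { ...t.taskParams, workflowDefinitionCode: mapped } };
    }),
  };
}
```

Because children are always emitted before their parents, every child's entry is already in codeMap by the time a parent is rewritten.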
Task
- ds task list -p <project> -w <workflow> - List tasks in a workflow
- ds task create -p <project> -w <workflow> -n <name> -t <type> - Create task
- ds task delete -p <project> -c <code> - Delete task
Schedule
- ds schedule list -p <project> - List schedules
- ds schedule create -p <project> -w <code> --cron <expr> - Create schedule
- ds schedule online -p <project> -i <id> - Bring schedule online
- ds schedule offline -p <project> -i <id> - Take schedule offline
Run / Execution
- ds run start -p <project> -c <code> - Start workflow
- ds run stop -p <project> -i <instance> - Stop workflow
- ds run pause -p <project> -i <instance> - Pause workflow
- ds run resume -p <project> -i <instance> - Resume workflow
Instance
- ds instance list -p <project> - List workflow instances
  - -s, --state <state> - Filter by state (SUCCESS/FAILURE/RUNNING_EXECUTION/etc.)
  - -w, --workflow <code> - Filter by workflow definition code
  - --search <val> - Search by instance name
  - --start-date <date> - Filter start date
  - --end-date <date> - Filter end date
  - --sort <field> - Sort by: duration|startTime (client-side)
  - --order <dir> - Sort order: asc|desc
  - -e, --env <name> - Environment name
- ds instance info -p <project> -i <id> - Show instance details
- ds instance tasks -p <project> -i <id> - List task instances (with duration, pid, sub-workflow drill-down)
- ds instance top-n -p <project> - Show top-N longest-running workflow instances
  - -n, --size <n> - Number of instances (default 10)
  - -s, --state <state> - Filter by state
  - -w, --workflow <code> - Filter by workflow definition code
  - --start-date <date> / --end-date <date> - Date range filter
  - -e, --env <name> - Environment name
- ds instance gantt -p <project> -i <id> - Show Gantt-style task timeline
  - -e, --env <name> - Environment name
- ds instance delete -p <project> -i <id> - Delete instance
Datasource
- ds datasource list - List datasources
- ds datasource test -i <id> - Test connection
- ds datasource tables -i <id> - List tables
- ds datasource columns -i <id> -t <table> - List columns
- ds datasource create -n <name> -t <type> - Create datasource
- ds datasource delete -i <id> - Delete datasource
Query
- ds query run -i <id> -s <sql> - Run SQL query
- ds query preview -i <id> -t <table> - Preview table data (10 rows)
Statistics
- ds stats workflow -p <project> - Workflow statistics (total/success/fail/avg/max/min duration)
  - --days <n> - Days to look back (default 7)
  - -w, --workflow <code> - Filter by workflow definition code
  - -e, --env <name> - Environment name
- ds stats task -p <project> - Task statistics per workflow instance
  - --days <n> - Days to look back (default 7)
  - -e, --env <name> - Environment name
- ds stats bottleneck -p <project> - Identify bottleneck workflows by total/avg/max duration and failure rate
  - --days <n> - Days to look back (default 7)
  - --top <n> - Number of top workflows to show (default 10)
  - -e, --env <name> - Environment name
- ds stats queue - Queue statistics
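To make the bottleneck metrics concrete, here is a rough sketch of the kind of per-workflow aggregation that ds stats bottleneck reports. The instance shape (`workflow`, `durationSec`, `state`) and the choice to rank by total duration are assumptions for illustration; the CLI's real ranking logic is not documented here.

```javascript
// instances: [{ workflow, durationSec, state }] (assumed shape)
function bottleneckStats(instances, top = 10) {
  const groups = new Map();
  for (const inst of instances) {
    const g = groups.get(inst.workflow) || {
      workflow: inst.workflow, count: 0, failed: 0, total: 0, max: -Infinity, min: Infinity,
    };
    g.count += 1;
    g.total += inst.durationSec;
    g.max = Math.max(g.max, inst.durationSec);
    g.min = Math.min(g.min, inst.durationSec);
    if (inst.state === 'FAILURE') g.failed += 1;
    groups.set(inst.workflow, g);
  }
  // Derive avg and failure rate, then keep the top-N workflows by total duration.
  return [...groups.values()]
    .map((g) => ({ ...g, avg: g.total / g.count, failureRate: g.failed / g.count }))
    .sort((a, b) => b.total - a.total)
    .slice(0, top);
}
```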
Monitor
- ds monitor master - View master nodes
- ds monitor worker - View worker nodes
- ds monitor database - Check database health
Log (Enhanced, no temp files)
- ds log view -t <task-instance> - View task log with smart filters
  - --tail <n> - Show only the last N lines
  - --grep <pattern> - Filter lines matching the regex
  - --exec-only - Show only execution output (skip DS metadata & script source)
  - --full - Auto-paginate to fetch the complete log (handles truncation)
  - --no-truncate - Do not truncate long lines (default: cap at 500 chars with note)
  - --meta - Show task metadata (state, host, pid, duration) before log
  - --limit <n> - Limit number of lines (default 100)
  - --skip-lines <n> - Skip first N lines
  - -e, --env <name> - Environment name
- ds log throughput -t <task-instance> - Analyze SeaTunnel throughput trends from task log
  - Auto-paginates the full log, extracts Read/Write row snapshots
  - Computes delta trends and average rates (rows/second)
  - -e, --env <name> - Environment name
All log output is streamed to stdout — no temporary files are generated.
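The delta and average-rate calculation behind ds log throughput can be illustrated with a small sketch. The snapshot shape (`{ t, rows }`, with `t` in seconds and `rows` a cumulative count) is an assumption; how ds-cli actually parses SeaTunnel log lines is not shown in this README.

```javascript
// snapshots: [{ t: seconds, rows: cumulative row count }], sorted by t (assumed shape).
function throughputTrend(snapshots) {
  const deltas = [];
  for (let i = 1; i < snapshots.length; i++) {
    const dt = snapshots[i].t - snapshots[i - 1].t;
    const dRows = snapshots[i].rows - snapshots[i - 1].rows;
    // Rate between two consecutive snapshots, in rows/second.
    deltas.push({ dt, dRows, rate: dt > 0 ? dRows / dt : 0 });
  }
  const first = snapshots[0];
  const last = snapshots[snapshots.length - 1];
  const elapsed = snapshots.length > 1 ? last.t - first.t : 0;
  // Average rate over the whole window, in rows/second.
  const avgRate = elapsed > 0 ? (last.rows - first.rows) / elapsed : 0;
  return { deltas, avgRate };
}
```

A falling per-interval rate alongside a steady average is the kind of pattern that usually points at a slowdown late in the task.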
Diagnose — One-click False-Success Detection
Encapsulates the workflow-instance analysis flow discovered during the openlibing GitCode issue-processing investigation.
- ds diagnose instance -p <project> -i <instanceId> - Diagnose a workflow instance
  - Lists state, tasks, false-success scan, SUB_WORKFLOW drill-down suggestions
  - --limit <n> - Max leaf tasks to inspect (default 20)
  - -e, --env <name> - Environment name
- ds diagnose task -t <task-instance> - Diagnose a single task instance
  - Shows metadata, false-success check, optional log dump
  - --show-log - Show last 30 lines
  - --show-exec - Show only execution output
  - --show-meta - Show full task metadata JSON
  - -e, --env <name> - Environment name
- ds diagnose status -p <project> - Spot anomalies
  - --state <state> - Filter by state (KILL/STOP/SUCCESS/FAILURE)
  - --workflow <code> - Filter by workflow definition code
  - --limit <n> - Max instances to show (default 30)
  - -e, --env <name> - Environment name
  - Marks "⚡快速成功(可疑)" ("fast success, suspicious") if SUCCESS < 5min (possible false success), and "🛑被中断" ("interrupted") for STOP/KILL
False-success criteria (the DS v3.x "假成功", i.e. false-success, pattern):
- DS reports SUCCESS AND
- pid=0 (DS plugin didn't fork a real Python process) AND
- No execution output in the log
- Optionally: duration < 5s (inconsistent with claimed SUCCESS)
- Excludes: SUB_WORKFLOW/SUB_PROCESS tasks (pid=0 is normal for these)
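The criteria above translate into a simple predicate. The sketch below is only an illustration: the task shape (`state`, `pid`, `taskType`, `durationSec`, `execOutputLines`) and the `requireShortDuration` option are hypothetical names, not the CLI's internal API.

```javascript
// Assumed task shape: { state, pid, taskType, durationSec, execOutputLines }
function isFalseSuccess(task, { requireShortDuration = false } = {}) {
  // pid=0 is normal for sub-workflow container tasks, so they are excluded.
  if (['SUB_WORKFLOW', 'SUB_PROCESS'].includes(task.taskType)) return false;
  if (task.state !== 'SUCCESS') return false;
  if (task.pid !== 0) return false;
  if (task.execOutputLines > 0) return false;
  // Optional extra signal: a run under 5 seconds.
  if (requireShortDuration && !(task.durationSec < 5)) return false;
  return true;
}
```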
Analyze — One-Click Failure Analysis
Use this when you need to answer "why did workflow X fail?" — it automatically chains env → project → instance → tasks → diagnose → log summary in a single call.
- ds analyze failure -n <name> - One-click failure analysis
  - Smart Instance Search: Uses the DolphinScheduler API searchVal parameter for server-side search first, then keyword fuzzy matching, then full pagination fallback. Auto-adds the [workflow]/[job] prefix if missing from the name
  - SUB_WORKFLOW Drill-Down: Recursively resolves SUB_WORKFLOW/SUB_PROCESS tasks to their leaf tasks. When subWorkflowInstanceId is unavailable (returns -), locates the sub-instance via name keyword matching + time proximity scoring
  - False-Success Detection: Identifies tasks with SUCCESS state but pid=0 or duration<5s (excluding SUB_WORKFLOW tasks where pid=0 is normal)
  - Log Analysis: Fetches and summarizes error/exception lines from failed and false-success tasks, with connection error handling for unreachable Worker log storage
  - Root Cause Summary: Provides categorized root cause, sub-workflow drill-down map, and actionable next steps
  - --show-log - Show last 50 lines of failed task log
  - --throughput - Analyze SeaTunnel throughput trends for failed tasks
  - -e, --env <name> - Environment name
Example:
# Analyze why a specific workflow instance failed in prod
ds analyze failure -n "获取构建和构建机对应关系-20260610021000015" -e prod
# Name without [workflow] prefix also works (auto-detected)
ds analyze failure -n "my-workflow-20260610" -e prod
# With log details and SeaTunnel throughput analysis
ds analyze failure -n "my-workflow-20260610" -e prod --show-log --throughput
Search Strategy (3-layer, fast → comprehensive):
- searchVal exact search: Uses the full name plus [workflow]/[job] prefix variants via the API searchVal parameter (fastest, works for instances > 1000)
- searchVal keyword search: Extracts keywords from the name, searches each via API, then fuzzy-matches results (handles partial names)
- Full pagination fallback: Iterates all instances page-by-page (covers edge cases)
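The layered fallback can be pictured as a list of search functions tried in order. The sketch below is an assumption-heavy illustration: `exactSearchCandidates` and `findInstance` are hypothetical names, the layers are plain synchronous callbacks here (real API calls would be asynchronous), and prefixes are assumed to be prepended without a separator.

```javascript
const PREFIXES = ['[workflow]', '[job]'];

// Layer 1 candidates: the name as given, plus prefixed variants
// when it does not already start with a known prefix.
function exactSearchCandidates(name) {
  const hasPrefix = PREFIXES.some((p) => name.startsWith(p));
  return hasPrefix ? [name] : [name, ...PREFIXES.map((p) => p + name)];
}

// Runs the layers in order and stops at the first one that returns a match.
function findInstance(name, layers) {
  for (const layer of layers) {
    const hit = layer(name);
    if (hit) return hit;
  }
  return null;
}
```

Ordering the layers from cheapest to most expensive means the full pagination scan only runs when both searchVal-based lookups come back empty.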
SUB_WORKFLOW Drill-Down details:
- Recursively resolves SUB_WORKFLOW/SUB_PROCESS tasks up to depth 5
- Cycle prevention via a visitedSubIds set
- When subWorkflowInstanceId is missing, uses name keyword matching + start-time proximity to locate the sub-instance
- Deduplicates leaf tasks by task ID
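A depth-limited recursive walk with a visited set covers most of the points above. In this sketch, `getTasks` is a hypothetical lookup supplied by the caller, the task shape is assumed, and the name/time-proximity fallback for a missing subWorkflowInstanceId is left out (such tasks are simply skipped).

```javascript
const MAX_DEPTH = 5;

// getTasks(instanceId) is a hypothetical lookup returning
// [{ id, taskType, subWorkflowInstanceId }] for one workflow instance.
function collectLeafTasks(rootInstanceId, getTasks) {
  const visitedSubIds = new Set([rootInstanceId]);
  const leaves = new Map(); // task id -> task, which deduplicates leaves

  function walk(instanceId, depth) {
    for (const task of getTasks(instanceId)) {
      const isSub = task.taskType === 'SUB_WORKFLOW' || task.taskType === 'SUB_PROCESS';
      if (!isSub) {
        leaves.set(task.id, task);
        continue;
      }
      const subId = task.subWorkflowInstanceId;
      // Skip unresolved ids, already-visited instances (cycles), and anything past the depth limit.
      if (subId == null || visitedSubIds.has(subId) || depth >= MAX_DEPTH) continue;
      visitedSubIds.add(subId);
      walk(subId, depth + 1);
    }
  }

  walk(rootInstanceId, 0);
  return [...leaves.values()];
}
```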
Investigate — End-to-End Process Tools
Designed for "let AI investigate this" scenarios. Each sub-command chains multiple atomic tools so a single MCP call returns a complete 360° view.
- ds investigate workflow -p <project> -w <workflow> - Single workflow 360° overview
  - Combines definition + 7-day statistics + top-N longest instances + recent failures
  - --days <n> - Look-back window in days (default 7)
  - --top-size <n> - Number of top-N instances to show (default 5)
  - -e, --env <name> - Environment name
- ds investigate task -t <task-instance> - Single task end-to-end deep-dive
  - Shows metadata, false-success check, error/exception log dump, throughput trend
  - --show-log - Include last 30 log lines
  - --show-exec - Execution output only
  - --throughput - SeaTunnel throughput analysis
  - --show-meta - Full task metadata JSON
  - -e, --env <name> - Environment name
- ds investigate health-check [-e <env>] - One-shot cluster health probe
  - Master nodes + Worker nodes + DB health + queue stats + recent failures
  - Output as a single report ready for AI summarization
- ds investigate workflow-timeline -p <project> -i <instance> - Gantt timeline + critical path
  - Resolves SUB_WORKFLOW tasks, marks the longest path, surfaces bottlenecks
  - -e, --env <name> - Environment name
When to use which:
| User intent | Command |
|---|---|
| "Why is workflow X failing?" | ds analyze failure -n X (one-shot root-cause) |
| "Give me a 360° view of workflow X" | ds investigate workflow -p P -w X (overview) |
| "Drill into this task's log" | ds investigate task -t T (deep-dive) |
| "Is the cluster healthy?" | ds investigate health-check (probe) |
| "Show the timeline of instance Y" | ds investigate workflow-timeline -p P -i Y (timeline) |
Skill Management
- ds skill list - List all available skills
- ds skill show <name> - Show full skill content
- ds skill playbooks - List all multi-step Playbook skills (if any)
- ds skill scenario "<text>" - Match a free-text scenario to a Playbook (returns matches if any)
- ds skill install - Install skills into agent directories (interactive, in CWD)
  - --all - Install to the default agent (.agents/skills) without prompting
  - -a, --agents <list> - Agent key to install for (default: agents)
  - --target <dir:agentName> - Custom directory (e.g. ./.trae/skills:trae)
  - --simulate - Preview without writing files
- ds skill uninstall - Remove installed skills
  - Same options as install
- ds skill where - Show resolved directories based on current working directory
- ds skill summary - Show compact summary of all skills
Global Options
- -h, --help - Display help
- -V, --version - Display version
- -e, --env <name> - Specify environment
- --json - Output in JSON format
- -v, --verbose - Enable verbose mode
- --no-color - Disable colored output
- -c, --config <path> - Path to config file
- --dry-run - Simulate execution
MCP Server (Model Context Protocol)
ds-cli doubles as a full MCP server, exposing all commands as tools for AI agents (Claude Desktop, Trae, Cursor, VS Code, Codex).
Logging: MCP server logs are automatically saved to ~/.ds/log/mcp-<timestamp>.log. Each startup creates a new log file containing server startup info and runtime errors.
Available MCP Tools (50)
Single-point tools (atomic operations) and process-oriented tools (multi-step orchestrations) are mixed below. Process-oriented tools are marked with ★/★★.
| Category | Tool | Description |
|----------|------|-------------|
| Environment | ds_env_list | List all configured environments (MCP env → ds CLI env fallback) |
| Environment | ds_env_current | Show current environment |
| Environment | ds_env_use | Switch environment |
| Environment | ds_env_add | Add MCP env (stored in ~/.config/ds-cli/mcp-envs.json) |
| Project | ds_project_list | List projects (optional env parameter) |
| Project | ds_project_info | Get project metadata (optional env parameter) |
| Project | ds_project_create | Create project (optional env parameter) |
| Project | ds_project_delete | Delete project (optional env parameter) |
| Workflow | ds_workflow_list | List workflow definitions (optional env parameter) |
| Workflow | ds_workflow_info | Get workflow details (optional env parameter) |
| Workflow | ds_workflow_create | Create workflow definition (optional env parameter) |
| Workflow | ds_workflow_release | Publish workflow (optional env parameter) |
| Workflow | ds_workflow_delete | Delete workflow definition (optional env parameter) |
| Task | ds_task_list | List tasks in a workflow (optional env parameter) |
| Task | ds_task_info | Get task definition details (optional env parameter) |
| Task | ds_task_create | Create task in a workflow (optional env parameter) |
| Task | ds_task_delete | Delete task definition (optional env parameter) |
| Schedule | ds_schedule_list | List schedules (optional env parameter) |
| Schedule | ds_schedule_create | Create schedule (optional env parameter) |
| Schedule | ds_schedule_online | Bring schedule online (optional env parameter) |
| Schedule | ds_schedule_offline | Take schedule offline (optional env parameter) |
| Run | ds_run_start | Start a workflow instance (optional env parameter) |
| Run | ds_run_stop | Stop a running instance (optional env parameter) |
| Run | ds_run_pause | Pause a running instance (optional env parameter) |
| Run | ds_run_resume | Resume a paused instance (optional env parameter) |
| Monitor | ds_monitor_master | View master node status (optional env parameter) |
| Monitor | ds_monitor_worker | View worker node status (optional env parameter) |
| Monitor | ds_monitor_database | Check database health (optional env parameter) |
| Instance | ds_instance_list | List workflow instances with sort/order/date filters (optional env parameter) |
| Instance | ds_instance_info | Get instance details (optional env parameter) |
| Instance | ds_instance_tasks | List tasks with duration/pid/sub-workflow (optional env parameter) |
| Instance | ds_instance_topn | ★ Top-N longest-running instances (optional env parameter) |
| Instance | ds_instance_gantt | Gantt-style task timeline (optional env parameter) |
| Instance | ds_instance_delete | Delete workflow instance (optional env parameter) |
| Log | ds_log_view | View task log with tail/grep/meta/full/exec-only (optional env parameter) |
| Log | ds_log_throughput | ★ SeaTunnel throughput trend analysis (optional env parameter) |
| Diagnose | ds_diagnose_instance | ★ One-click false-success detection (optional env parameter) |
| Diagnose | ds_diagnose_task | Single task diagnosis (optional env parameter) |
| Diagnose | ds_diagnose_status | Recent diagnose history with state/workflow filters (optional env parameter) |
| Analyze | ds_analyze_failure | ★★ One-click failure analysis: auto-locates instance (searchVal + keyword + pagination), drills down SUB_WORKFLOW to leaf tasks, identifies failed & false-success tasks (excluding SUB_WORKFLOW), summarizes error logs, provides root cause (optional env parameter) |
| Investigate | ds_investigate_workflow | ★★ Workflow 360°: definition + 7d stats + top-N + recent failures (optional env parameter) |
| Investigate | ds_investigate_task | ★★ Task deep-dive: meta + false-success + log dump + throughput (optional env parameter) |
| Investigate | ds_health_check | ★★ Cluster health probe: master + worker + DB + queue + recent failures (optional env parameter) |
| Investigate | ds_workflow_timeline | ★★ Gantt timeline + critical path with SUB_WORKFLOW resolution (optional env parameter) |
| Datasource | ds_datasource_list | List datasources (optional env parameter) |
| Query | ds_query_run | Execute SQL query (optional env parameter) |
| Stats | ds_stats_workflow | Workflow statistics with avg/max/min duration (optional env parameter) |
| Stats | ds_stats_task | Task statistics per workflow instance (optional env parameter) |
| Stats | ds_stats_bottleneck | ★ Bottleneck workflow detection (optional env parameter) |
| Skill | ds_skill_list | List available AI Agent skills |
| Skill | ds_skill_show | Show skill full content |
| Skill | ds_skill_scenario | Match a free-text scenario to a Playbook (returns matches if any) |
Dynamic Recommendations (recommended_next)
Every MCP tool response auto-appends a → recommended_next: block so AI agents can
chain follow-up tools without losing context. Two sources contribute:
- Explicit declaration: defineTool(..., nextSteps) adds curated suggestions to the tool description, visible in tools/list (lets the agent plan ahead).
- Heuristic scan: at runtime, the server scans the tool's stdout for keywords (FAIL, False-Success, SeaTunnel, SUB_WORKFLOW, Gantt, etc.) and injects up to 4 relevant follow-up tool calls.
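The heuristic scan can be approximated by a keyword table checked against the tool output. The keyword-to-tool pairs below are illustrative guesses rather than the shipped rule set, and `recommendNext` / `appendRecommendations` are hypothetical names; only the keywords, the tool names, and the cap of 4 come from this README.

```javascript
// Keyword -> follow-up tool. These pairings are assumptions for the example.
const RULES = [
  { keyword: 'False-Success', tool: 'ds_investigate_task' },
  { keyword: 'FAIL', tool: 'ds_log_view' },
  { keyword: 'SUB_WORKFLOW', tool: 'ds_instance_gantt' },
  { keyword: 'SeaTunnel', tool: 'ds_log_throughput' },
  { keyword: 'Gantt', tool: 'ds_workflow_timeline' },
];
const MAX_SUGGESTIONS = 4;

// Explicit (curated) suggestions come first; heuristic matches fill the rest.
function recommendNext(stdout, explicit = []) {
  const picked = [...explicit];
  for (const { keyword, tool } of RULES) {
    if (stdout.includes(keyword) && !picked.includes(tool)) picked.push(tool);
  }
  return picked.slice(0, MAX_SUGGESTIONS);
}

// Appends the recommended_next block only when there is something to suggest.
function appendRecommendations(stdout, explicit = []) {
  const next = recommendNext(stdout, explicit);
  if (next.length === 0) return stdout;
  const lines = next.map((tool, i) => `${i + 1}. ${tool}`);
  return `${stdout}\n→ recommended_next:\n${lines.join('\n')}`;
}
```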
Example (truncated response from ds_analyze_failure):
Root cause: SQL syntax error in task `dwd_user_active` (pid=0, no execution output)
False-Success scan: 1 task(s) flagged
Sub-Workflow drill-down: 1 sub-instance, 3 leaf task(s) inspected
→ recommended_next:
1. ds_investigate_task (because false-success detected; drill into log + throughput)
2. ds_log_view (because FAIL detected; fetch last 50 lines of error log)
3. ds_instance_gantt (because SUB_WORKFLOW drill-down; visualize the timeline)
Multi-Environment Usage in MCP
All business tools (Project/Workflow/Instance/Log/Diagnose/Datasource/Query/Stats) accept an optional env parameter to specify which environment to use:
- No env parameter: Uses the default current environment
- With env parameter: Uses the specified environment (must exist in your config)
Example usage (Chat interface):
List projects in the beta environment using ds_project_list with env=beta
Example with tool call:
{
"name": "ds_project_list",
"arguments": {
"env": "beta"
}
}
If you specify an environment that doesn't exist, you'll get an error message listing all available environments. Use ds_env_list to see all configured environments first.
MCP Configuration for AI Clients
All clients support an optional env field to inject MCP env credentials directly,
eliminating the need for ~/.ds/config.json.
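Hand-escaping the DS_MCP_ENVS JSON string is error-prone, so one option is to generate it. The sketch below builds a server entry in the shape used by the client configs in this section; `buildMcpEnv` is a hypothetical helper, its sanity check is an addition for the example, and the hosts and tokens are placeholders.

```javascript
// Builds the "env" block for an MCP client config from a plain object.
function buildMcpEnv(current, envs) {
  if (!Object.prototype.hasOwnProperty.call(envs, current)) {
    throw new Error(`current env "${current}" is not defined in envs`);
  }
  // The inner JSON.stringify produces the string value expected in DS_MCP_ENVS.
  return { DS_MCP_ENVS: JSON.stringify({ current, envs }) };
}

const serverEntry = {
  command: 'npx',
  args: ['-y', '@openlibing/ds-cli@latest', 'mcp', 'start'],
  env: buildMcpEnv('beta', {
    beta: { url: '<BETA_IP>:12345', token: '<YOUR_TOKEN>', protocol: 'http' },
    dev: { url: '<DEV_IP>:12345', token: '<YOUR_TOKEN>', protocol: 'http' },
  }),
};

// The outer JSON.stringify escapes the inner quotes, so the output can be
// pasted into a client config file.
console.log(JSON.stringify({ mcpServers: { 'ds-cli': serverEntry } }, null, 2));
```

For VS Code, the same entry would go under the servers key instead of mcpServers.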
Claude Desktop
%APPDATA%\Claude\claude_desktop_config.json (Windows)
~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
Basic (no env — falls back to ds CLI ~/.ds/config.json):
{
"mcpServers": {
"ds-cli": {
"command": "npx",
"args": ["-y", "@openlibing/ds-cli@latest", "mcp", "start"]
}
}
}
With MCP env config (credentials inlined, no ~/.ds/config.json needed):
{
"mcpServers": {
"ds-cli": {
"command": "npx",
"args": ["-y", "@openlibing/ds-cli@latest", "mcp", "start"],
"env": {
"DS_MCP_ENVS": "{\"current\":\"beta\",\"envs\":{\"beta\":{\"url\":\"<BETA_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\"},\"dev\":{\"url\":\"<DEV_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\"}}}"
}
}
}
}
Trae IDE
~/.trae/mcp.json
Basic:
{
"mcpServers": {
"ds-cli": {
"command": "npx",
"args": ["-y", "@openlibing/ds-cli@latest", "mcp", "start"]
}
}
}
With MCP env config:
{
"mcpServers": {
"ds-cli": {
"command": "npx",
"args": ["-y", "@openlibing/ds-cli@latest", "mcp", "start"],
"env": {
"DS_MCP_ENVS": "{\"current\":\"beta\",\"envs\":{\"beta\":{\"url\":\"<BETA_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\"},\"dev\":{\"url\":\"<DEV_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\"}}}"
}
}
}
}
Cursor
~/.cursor/mcp.json
Basic:
{
"mcpServers": {
"ds-cli": {
"command": "npx",
"args": ["-y", "@openlibing/ds-cli@latest", "mcp", "start"]
}
}
}
With MCP env config:
{
"mcpServers": {
"ds-cli": {
"command": "npx",
"args": ["-y", "@openlibing/ds-cli@latest", "mcp", "start"],
"env": {
"DS_MCP_ENVS": "{\"current\":\"beta\",\"envs\":{\"beta\":{\"url\":\"<BETA_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\"},\"dev\":{\"url\":\"<DEV_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\"}}}"
}
}
}
}
VS Code (Copilot Chat)
%APPDATA%\Code\User\mcp.json (note: uses servers key)
Basic:
{
"servers": {
"ds-cli": {
"command": "npx",
"args": ["-y", "@openlibing/ds-cli@latest", "mcp", "start"]
}
}
}
With MCP env config:
{
"servers": {
"ds-cli": {
"command": "npx",
"args": ["-y", "@openlibing/ds-cli@latest", "mcp", "start"],
"env": {
"DS_MCP_ENVS": "{\"current\":\"beta\",\"envs\":{\"beta\":{\"url\":\"http://<BETA_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\",\"path\":\"\"},\"dev\":{\"url\":\"http://<DEV_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\",\"path\":\"\"}}}"
}
}
}
}
Codex CLI
~/.codex/config.toml (TOML format)
Basic:
[mcp_servers.ds-cli]
command = "npx"
args = ["-y", "@openlibing/ds-cli@latest", "mcp", "start"]
With MCP env config:
[mcp_servers.ds-cli]
command = "npx"
args = ["-y", "@openlibing/ds-cli@latest", "mcp", "start"]
env.DS_MCP_ENVS = "{\"current\":\"beta\",\"envs\":{\"beta\":{\"url\":\"<BETA_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\"},\"dev\":{\"url\":\"<DEV_IP>:12345\",\"token\":\"<YOUR_TOKEN>\",\"protocol\":\"http\"}}}"
Tip: Generate these configs automatically with ds mcp config (basic) or ds mcp config --save --client <id> (auto-merge), then manually add the env field with your credentials.
MCP Env Config (separate from ds CLI env)
MCP server can manage its own environment config independently from ~/.ds/config.json.
This is ideal for CI/CD, shared teams, or when you don't want credentials on disk.
Config priority (high → low):
- DS_MCP_ENVS env var (JSON string)
- <cwd>/.ds-mcp.json (project-level)
- ~/.config/ds-cli/mcp-envs.json (user-level)
- ds CLI ~/.ds/config.json (fallback)
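The priority order above amounts to a first-match lookup. The following sketch shows one way to express it; `resolveMcpEnvConfig` is a hypothetical function, not the server's real loader, and it assumes every source parses as JSON (the ds CLI fallback file may use a different schema in practice).

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Returns { source, config } for the highest-priority source found, or null.
function resolveMcpEnvConfig({ env = process.env, cwd = process.cwd(), home = os.homedir() } = {}) {
  // 1. Environment variable wins over any file.
  if (env.DS_MCP_ENVS) {
    return { source: 'DS_MCP_ENVS', config: JSON.parse(env.DS_MCP_ENVS) };
  }
  // 2-4. Project-level, then user-level, then the ds CLI config as fallback.
  const files = [
    path.join(cwd, '.ds-mcp.json'),
    path.join(home, '.config', 'ds-cli', 'mcp-envs.json'),
    path.join(home, '.ds', 'config.json'),
  ];
  for (const file of files) {
    if (fs.existsSync(file)) {
      return { source: file, config: JSON.parse(fs.readFileSync(file, 'utf8')) };
    }
  }
  return null;
}
```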
Config file format:
{
"current": "beta",
"envs": {
"beta": { "url": "<BETA_IP>:12345", "token": "<YOUR_TOKEN>", "protocol": "http" },
"dev": { "url": "<DEV_IP>:12345", "token": "<YOUR_TOKEN>", "protocol": "http" },
"prod": { "url": "<PROD_HOST>:12345", "token": "<YOUR_TOKEN>", "protocol": "https" }
}
}
ds mcp env subcommands (full management):
# List all MCP envs
ds mcp env list
# Show current MCP env
ds mcp env current
# Switch MCP env (in-memory)
ds mcp env use beta
# Add or REPLACE an MCP env (upsert — no duplicate warning)
ds mcp env add beta -u <BETA_IP>:12345 -t <YOUR_TOKEN> -p http
# Remove an MCP env
ds mcp env remove beta
# Show raw config JSON
ds mcp env show
# Show config source priority
ds mcp env where
ds mcp config (print / auto-save client configs):
# Print all client configs
ds mcp config
# Print specific client
ds mcp config --client claude-desktop
# Use global `ds` command instead of npx
ds mcp config --client trae --use-global
# Auto-merge into config file
ds mcp config --client claude-desktop --save
MCP CLI Commands
# Start MCP server (stdio mode)
ds mcp start
ds mcp # same as start
# List all MCP tools
ds mcp tools
# Show MCP env config
ds mcp env list
ds mcp env show
ds mcp env where
ds mcp env add <name> -u <url> -t <token>
ds mcp env use <name>
ds mcp env remove <name>
# Generate client configs
ds mcp config
ds mcp config --client <id>
AI Skills
ds-cli ships with built-in YAML Skills that can be installed into AI agent directories.
The default install location is .agents/skills/ (the standard convention used by
Anthropic Claude Code).
For other agent layouts, pass --target <dir>:<agentName>.
Single-Point Skills (atomic commands)
| Skill | Description |
|-------|-------------|
| ds-env | Environment management |
| ds-project | Project management |
| ds-workflow | Workflow management |
| ds-workflow-sync | Cross-environment workflow definition sync (export + import with topo order, codeMap rewriting, ONLINE/OFFLINE state-machine, friendly error hints) |
| ds-task | Task management |
| ds-schedule | Schedule management |
| ds-execute | Execution control |
| ds-instance | Instance queries |
| ds-datasource | Datasource management |
| ds-query | SQL queries |
| ds-stats | Statistics |
| ds-inspect | Interactive instance analysis (env picker → instance lookup → branching by state) |
Installing Skills
Skills are installed into the current working directory (CWD) by default.
The default target is <CWD>/.agents/skills/. Run ds skill install from your project directory.
# 1. Show resolved directories
ds skill where
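# 2. Interactive install: enter a single "base directory" (relative to the current project);
#    the script creates a skills/ subdirectory under it and generates every SKILL.md.
#    Example: enter .agents  → .agents/skills/
#             enter my-agent → my-agent/skills/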
ds skill install
# 3. Install to the default agent (.agents/skills) without prompting
ds skill install --all
# 4. Install for the default agent explicitly
ds skill install -a agents
# 5. Custom directory (any path + agent name) — use this for trae/opencode/codex
ds skill install --target './.trae/skills:trae'
ds skill install --target './.opencode/skills:opencode'
ds skill install --target './.codex/skills:codex' # also enables --user for ~/.codex/skills
# 6. Preview only (no files written)
ds skill install --simulate
# 7. Uninstall
ds skill uninstall --all
Programmatic Loading (Node.js)
const { loadAllSkills, getSkillsSummary, loadSkill } = require('@openlibing/ds-cli/src/utils/skill-loader');
// Get all skills as summary text for AI
console.log(getSkillsSummary());
// Load individual skill
const diagnoseSkill = loadSkill('ds-diagnose');
console.log(diagnoseSkill.commands);
Diagnosing False-Success Issues
The false-success pattern is a DS v3.x anomaly where a task reports SUCCESS but
the Python/SQL script was never actually executed. Symptoms:
- pid=0 (DS plugin failed before forking)
- Log contains only the script source code (the second echo of the script body)
- duration < 5s (contradicting the claimed success)
- Downstream tables not updated
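The first three symptoms can be checked mechanically from a task record. The sketch below is illustrative only: the record shape (`state`, `pid`, `durationMs`, `logLines`, `scriptLines`) and the function name are assumptions, not ds-cli's data model or its actual detection logic. The fourth symptom (downstream tables not updated) needs a datasource query and is left out.

```javascript
// Hypothetical task record: { state, pid, durationMs, logLines, scriptLines }.
// Field names are assumptions for illustration, not ds-cli's real model.
function falseSuccessSignals(task) {
  const signals = [];
  // Only a task that reports SUCCESS can be a false success.
  if (task.state !== 'SUCCESS') return signals;

  if (task.pid === 0) signals.push('pid=0');
  if (task.durationMs < 5000) signals.push('duration<5s');

  // The log contains nothing except lines that also appear in the script source.
  const script = new Set(task.scriptLines.map((line) => line.trim()));
  const logIsOnlySource =
    task.logLines.length > 0 &&
    task.logLines.every((line) => script.has(line.trim()));
  if (logIsOnlySource) signals.push('log-is-script-echo');

  return signals;
}
```

A task that returns one or more signals is a candidate for closer inspection rather than a confirmed false success; a short, legitimately fast task can trip the duration check on its own.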
Quick diagnosis flow:
# 1. Find the longest-running instances
ds instance top-n -p <project> -n 10
# 2. Identify bottleneck workflows
ds stats bottleneck -p <project> --days 7 --top 10
# 3. One-click diagnose
ds diagnose instance -p <project> -i 2291
# 4. Drill into a suspicious task
ds diagnose task -t 4877
# 5. Inspect the raw log (with full-log pagination)
ds log view -t 4877 --full --tail 50
# 6. Filter to just execution output
ds log view -t 4877 --exec-only
# 7. Analyze SeaTunnel throughput trends
ds log throughput -t 4877
Common root causes (in order of frequency):
- Worker resource contention: multiple parallel tasks saturate a single worker
- Python interpreter missing or decrypt-tool.jar failed to load
- Working directory permission denied: /tmp/dolphinscheduler/exec/process/<taskId>/
- Worker group misconfiguration: workerGroup doesn't exist or has no workers
- Task timeout too short
DolphinScheduler v3.x Compatibility
Several API paths changed in DS 3.x. ds-cli tries the v3 path first and automatically falls back to the v2 path:
| Resource | v2 path | v3 path (used by ds-cli) |
|----------|---------|--------------------------|
| Projects | /projects/list-paging | /projects/created-and-authed |
| Workflow defs | /workflow-definition/list-paging | /workflow-definition/list |
| Workflow tasks | /task-definition/list-by-workflow | /workflow-definition/{code}/tasks |
| Workflow instances | /workflow-instances/list-paging | /workflow-instances (no suffix) |
| Log query | pageNum=... | pageNo=... (we use pageNo) |
If your DS version is 2.x, ds-cli will still work because the call paths are usually backwards-compatible, but you may see some fields default to - when a v3-only field is missing.
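The v3-first, v2-fallback behavior can be sketched as follows. This is an assumption-laden illustration, not ds-cli's source: `PATHS` copies three rows from the table above, `request` is a caller-supplied async function, and treating any failure on the v3 path as the fallback trigger is a guess, since the exact condition is not documented here.

```javascript
// Path pairs copied from the compatibility table (three rows shown).
const PATHS = {
  projects: { v3: '/projects/created-and-authed', v2: '/projects/list-paging' },
  workflowDefs: { v3: '/workflow-definition/list', v2: '/workflow-definition/list-paging' },
  workflowInstances: { v3: '/workflow-instances', v2: '/workflow-instances/list-paging' },
};

// `request` is a hypothetical async function (path) => data supplied by the caller.
async function getWithFallback(resource, request) {
  const { v3, v2 } = PATHS[resource];
  try {
    return { version: 'v3', data: await request(v3) };
  } catch {
    // Assumption: any failure on the v3 path triggers one attempt on the v2 path.
    return { version: 'v2', data: await request(v2) };
  }
}
```

If the v2 attempt also fails, its error propagates to the caller, so a server that is simply unreachable still surfaces as an error instead of an empty result.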
Configuration
Configuration is stored in ~/.ds/config.json (mode 0o600):
{
"default": "production",
"envs": {
"dev": {
"url": "127.0.0.1:12345",
"token": "your-token-here",
"protocol": "http",
"timeout": 30000
},
"beta": {
"url": "<beta-ip>:12345",
"token": "beta-token-here",
"protocol": "https",
"timeout": 30000
},
"production": {
"url": "<prod-ip>:12345",
"token": "prod-token-here",
"protocol": "http",
"timeout": 30000
}
}
}
Field semantics:
- url: IP:PORT only, no http:// or https:// prefix (v0.4.0+). The CLI strips any prefix it finds for forward compatibility.
- protocol: http or https (lowercased on write, defaults to http).
- host: not stored on disk; built at runtime as ${protocol}://${url}. This eliminates the historic inconsistency where host: http://... and protocol: https could coexist in the same file.
- timeout: request timeout in ms (default 30000).
Simplified usage: Only url and token are required for each environment. Add --protocol https when adding the env if you need TLS.
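Taken together, these rules amount to a small normalization step. The sketch below shows one way it might look; `resolveEnv` is a hypothetical name, and the code is based only on the field semantics described here, not on ds-cli's implementation.

```javascript
// Hypothetical normalizer (not ds-cli's code): apply the documented defaults
// and build the runtime-only host value.
function resolveEnv(env) {
  if (!env.url || !env.token) {
    throw new Error('url and token are required');
  }
  // protocol is lowercased and defaults to http.
  const protocol = (env.protocol || 'http').toLowerCase();
  if (protocol !== 'http' && protocol !== 'https') {
    throw new Error(`Unsupported protocol: ${env.protocol}`);
  }
  // Strip any scheme prefix so url stays IP:PORT only.
  const url = env.url.replace(/^https?:\/\//i, '');
  return {
    url,
    token: env.token,
    protocol,
    timeout: env.timeout ?? 30000, // ms
    host: `${protocol}://${url}`, // derived at runtime, never written to disk
  };
}
```

Because `host` is always derived from `protocol` and `url`, the two can no longer disagree, which is the inconsistency the field semantics describe.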
Legacy TOML: ds-cli auto-detects a pre-existing ~/.ds/config.toml on first load, converts it to JSON, and renames the original to config.toml.bak (one-time migration, irreversible). After that the JSON file is the only source of truth.
Architecture
ds-cli is organized as follows:
- src/services/api.js: Axios-based HTTP client with token auth, retry/back-off, error mapping (DS100–DS599 codes), and timeout enforcement.
- src/services/auth.js: Token-based auth (no username/password).
- src/utils/config.js: ~/.ds/config.json I/O with mode 0o600, multi-env management, legacy TOML auto-migration.
- src/utils/skill-loader.js: YAML skill + playbook parser with playbook: block awareness, getAllPlaybooks() / matchPlaybook(text).
- src/commands/: One file per command domain (analyze, diagnose, investigate, instance, skill, …). Each command is a commander sub-command module.
- src/mcp-server.js: Exposes CLI commands as MCP tools via @modelcontextprotocol/sdk. Supports nextSteps declaration + runtime recommended_next injection.
- bin/ds: CLI entry, registers all command modules.
- skills/*.yaml: Skill + Playbook definitions consumed by AI agents (installed to .agents/skills/ by default).
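As an illustration of the retry/back-off idea mentioned for the HTTP client, here is a generic exponential back-off wrapper. It is a sketch under stated assumptions: the function name, the retry count, and the delay values are all hypothetical and say nothing about how src/services/api.js is actually written.

```javascript
// Default sleep helper; can be replaced with a stub in tests.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical retry wrapper: retries `fn` with exponentially growing delays.
// `retries` and `baseDelayMs` are illustrative defaults, not ds-cli settings.
async function withRetry(fn, { retries = 3, baseDelayMs = 200, sleep = defaultSleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      // Wait baseDelayMs * 2^attempt before the next try; skip the wait after the final attempt.
      if (attempt < retries) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

Passing `sleep` as an option keeps the wrapper testable without real timers; a production client would usually also restrict retries to errors that are safe to repeat.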
License
Apache-2.0
