@uipath/servo
v26.4.17562447-alpha
UiPath Servo - UI automation CLI
uipath-servo
Desktop and browser UI automation CLI. Inspect UI trees, interact with elements, and extract structured data — from the terminal or through coding agents.
Key Features
- Desktop + Web — automate native desktop applications (Win32, WPF, UWP, WinUI, Java, Qt, SAP) and browser tabs with one tool
- Token-efficient — structured snapshots with element refs, no page dumps into LLM context
- Multiple input methods — hardware events, control APIs, browser debugger protocol, Win32 messages
Requirements
- Windows (x64)
- Claude Code, Cursor, Gemini CLI, or any other coding agent
Installation
npm install -g @uipath/servo@latest
servo help
Agent Skills
Install agent skills from https://github.com/UiPath/skills so your agent knows all available commands.
Without skills, point your agent at the CLI and let it figure things out from servo help:
Test the UI of my Calculator app using servo.
Check "servo help" for available commands.
Getting Started
# List windows and browser tabs
servo targets
# Snapshot a window to get its UI tree with element refs
servo snapshot w1
# Interact using refs from the snapshot
servo click e5
servo type e3 "hello"
servo select e8 "Option B"
# Re-snapshot after UI changes — refs may be stale
servo snapshot w1
# Take a screenshot
servo screenshot w1
The server starts automatically on first command; no setup needed.
Demo
Your agent will run these commands autonomously, but you can also use them interactively:
servo targets
servo snapshot w1
servo click e12 # Click a menu item
servo snapshot w1 # Refresh the tree
servo type e5 "Hello, World!" --clear-before
servo screenshot w1
Commands
All commands accept --timeout (default 30s) and --visualize (shows a visual indicator of the target element). Commands that write output files accept --filename <path> to override the output file path. Everything is case-sensitive.
Discover
servo targets # List windows (w-refs) and browser tabs (b-refs)
servo targets --exclude Windows # Browser tabs only
servo snapshot w1 # Capture UI tree with element refs (e-refs)
servo snapshot b1 # Snapshot a browser tab
servo snapshot w1 --framework UiaOnly # Use specific UI framework
Interact
servo click e5 # Click (left, single)
servo click e5 --button Right # Right-click
servo click e5 --type Double # Double-click
servo click e5 -i ControlApi # Click via element API (background windows)
servo click e5 --offset "10,-5" # Click with pixel offset from center (default origin)
servo click e5 --origin TopLeft --offset "5,-10" # Origin: Center, TopLeft, TopRight, BottomLeft, BottomRight, TopCenter, BottomCenter, LeftCenter, RightCenter
servo click e5 --modifiers Ctrl # Ctrl+click (e.g. multi-select in lists)
servo click e5 -m "Ctrl,Shift" # Ctrl+Shift+click
servo hover e4 # Hover over element
servo type e3 "some text" # Type into element
servo type e3 "text" --clear-before # Clear field, then type (implies --click-before)
servo type e3 "text" --click-before # Click field before typing (auto-enabled for HardwareEvents)
servo type e3 "text" -i ControlApi # Use ControlApi (may auto-clear)
servo type e3 "text" -i WebBrowserDebugger # Use browser debugger protocol (recommended for Chrome/Edge)
servo type e3 "[d(ctrl)]a[u(ctrl)]" # Select all (Ctrl+A)
servo select e8 "Option B" # Select item from dropdown/list
servo wheel e5 --direction Down -c 10 # Scroll down 10 clicks
servo focus e5 # Bring element into view and focus
servo window w2 BringToForeground # Bring window to front
servo window w1 Maximize # Maximize, Minimize, Restore, Close, Hide, Show
Inspect
servo get e5 text # Read a single attribute
servo get-all e5 # Read all attributes
servo get-attribute-names e5 # List available attribute names
servo screenshot # Full desktop screenshot
servo screenshot w1 # Window screenshot
servo screenshot e5 # Element screenshot
servo extract-table e5 # Extract table data as markdown
servo highlight e5 # Draw red border (3s)
servo highlight e5 --color Blue --duration 5
servo selector e5 # Get UiPath selector string
Manage
servo server --start # Start server manually
servo server --stop # Stop server
servo server --status # Check server status
servo server --start -s sess1 # Start server for specific session
servo server --stop -s sess1 # Stop server for specific session
servo server --kill-all # Kill all server processes (last resort)
servo clean # Delete output files
Commands that produce output (targets, snapshot, screenshot, etc.) print a summary to the console and write details to a file:
servo targets
### Top-level targets (11 windows, 2 browser tabs)
- [Targets](.servo/output/targets-2026-07-19T12-10-01-002.yml)
Read the linked .servo/output/ file to see the full results.
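A script or agent usually needs the detail file path rather than the summary text. The sketch below pulls the path out of the link in the summary. It is only an illustration: `output_paths` is a hypothetical helper name, and it assumes the summary keeps the `[Label](path)` shape shown above.

```shell
# Print every path that appears inside a "[Label](path)" link on stdin.
# Lines without such a link are ignored.
output_paths() {
  sed -n 's/.*](\([^)]*\)).*/\1/p'
}
```

If the summary is written to standard output (an assumption, not confirmed here), `servo targets | output_paths` would print the path of the targets file.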
Snapshot Format
Snapshots show the UI accessibility tree. Each node has:
- Role - Element type: Button, InputBox, CheckBox, DropDown, List, TreeItem, TabPage, MenuItem, etc.
- "Name" - Accessible label in quotes (e.g., Button "OK")
- [ref=eN] - Element reference for interaction. Use this ref in commands.
- [state] - State markers: [selected], [focused], [disabled], [read only]
- : text - Inline value (e.g., InputBox [ref=e3]: pre-filled)
- /attr - Attributes as child lines (e.g., /url: https://..., /placeholder: Type here)
- Children - Nested with indentation
Example:
- DropDown [ref=e73]: Second
  - Option "-- Choose --"
  - Option "First"
  - Option "Second" [selected]
  - Option "Third (disabled)"
- InputBox "Username" [ref=e5]: john_doe
  - /placeholder: Enter username
- CheckBox "Remember me" [ref=e6] [selected]
- Button "Submit" [ref=e7]
- Button "Cancel" [ref=e8] [disabled]
Key rules:
- Only elements with [ref=eN] are directly interactable. Option children without refs are shown for context; use servo select on the parent DropDown's ref instead.
- [disabled] elements cannot be interacted with; skip them.
- [selected] on CheckBox/RadioButton means checked; on TabPage/ListItem means active.
- Snapshots may not show all text or attribute values. Use servo get <e> text or servo get-all <e> to read values, or servo extract-table <e> for table data.
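The first two rules can be applied mechanically to a saved snapshot. The sketch below lists the refs that are safe to act on. `snapshot_refs` is a hypothetical helper name, and the code assumes the snapshot file uses the line format described above, with at most one ref per line.

```shell
# Read snapshot text on stdin and print one element ref per line (e.g. e5).
# Nodes marked [disabled] are skipped; nodes without a ref are ignored.
snapshot_refs() {
  grep -v '\[disabled\]' | sed -n 's/.*\[ref=\(e[0-9][0-9]*\)\].*/\1/p'
}
```

Run against the example above, this prints e73, e5, e6, and e7, and leaves out the disabled Cancel button (e8).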
Ref Lifecycle
Window refs (w1, w2) and browser refs (b1, b2) are assigned by servo targets. They reset on each servo targets call.
Element refs (e1, e2, e3...) are assigned by servo snapshot. They reset on each servo snapshot call.
b-refs target browser tabs; w-refs target windows. See Application Guides > Browsers for details.
Always re-snapshot after actions that change UI state. Clicking a button, selecting a tab, or typing may alter the UI tree, making previous e-refs invalid.
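One way to make the re-snapshot hard to forget is to wrap each action. This is a sketch only: `act_then_snapshot` is a hypothetical helper name, and it assumes servo exits with a non-zero status when an action fails, which this README does not state.

```shell
# Run one servo action, then re-snapshot the same target so the next
# step starts from fresh e-refs. The snapshot is skipped if the action
# reports failure.
# Usage: act_then_snapshot w1 click e5
act_then_snapshot() {
  target=$1
  shift
  servo "$@" && servo snapshot "$target"
}
```

The same sequence written out by hand looks like this: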
servo snapshot w1 # Get refs
servo click e5 # Perform action
servo snapshot w1 # Get fresh refs — old ones are stale
servo type e3 "hello" # Use new refs
Frameworks
Use --framework with servo snapshot to control how the UI tree is scanned.
Default auto-detects the best technology for most apps. Use it unless specified otherwise.
Use --framework UiaOnly for:
- WinUI3 apps — modern Windows apps like Windows Terminal, and the redesigned Notepad, Paint, Calculator, Media Player
- WPF apps — .NET desktop apps with rich UI like Visual Studio, Blend, or any app built with XAML
- SAP Logon — the connection picker window shown when SAP GUI first opens
If a snapshot looks empty or incomplete, try a different framework.
Input Methods
Use --input-method (-i) with click, type, and hover:
- HardwareEvents (default) - Simulates real mouse/keyboard. Auto-activates the window (foreground required). Typing appends to existing text; use --clear-before to clear first.
- ControlApi - Uses the element's native API directly. Works on background windows. Usually auto-clears the field before typing, so do not use --clear-before by default. Verify the result with servo get, servo get-all, or re-snapshot; only retry with --clear-before if the field contains unexpected text. Recommended for Firefox (only in b-ref mode), Java Swing/AWT apps, and SAP WinGUI session windows.
- WebBrowserDebugger - Dispatches via Chromium Debugger. Recommended for Chrome/Edge. Does not require foreground.
Switch input methods when the default has no visible effect on the target element.
Special keys in servo type and --modifiers in servo click are fully supported with HardwareEvents and WebBrowserDebugger. ControlApi may support special keys for some applications (e.g., browsers via b-refs and SAP session windows), but this is not guaranteed; other input methods may silently ignore them.
Special Keys
servo type supports special key syntax:
- [k(key)] - Press and release (e.g., [k(enter)], [k(tab)])
- [d(key)] - Hold down (e.g., [d(ctrl)], [d(shift)])
- [u(key)] - Release (e.g., [u(ctrl)], [u(shift)])
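Hold/release pairs are easy to mistype, so it can help to generate them. The helpers below are a sketch. `key_chord` and `escape_literal` are hypothetical names. Nesting several modifiers is an assumption, and the "[[" escape rule is generalised from the single literal-bracket example in this section.

```shell
# Wrap a key in hold/release markers, one pair per modifier.
# Modifiers are pressed left to right and released in reverse order.
# Usage: key_chord ctrl a
key_chord() (
  down=''
  up=''
  while [ "$#" -gt 1 ]; do
    down="${down}[d($1)]"
    up="[u($1)]${up}"
    shift
  done
  printf '%s%s%s\n' "$down" "$1" "$up"
)

# Double every "[" so the text is typed literally instead of being
# parsed as special-key syntax (assumed escape rule, see above).
escape_literal() {
  printf '%s\n' "$1" | sed 's/\[/[[/g'
}
```

For example, `servo type e3 "$(key_chord ctrl a)"` passes the same string as the Select all example that follows.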
servo type e3 "[d(ctrl)]a[u(ctrl)]" # Select all
servo type e3 "[d(ctrl)]c[u(ctrl)]" # Copy
servo type e3 "[k(enter)]" # Press Enter
servo type e3 "Line 1[k(enter)]Line 2" # Type multiline
servo type e3 "[[k(enter)]" # Type literal "[k(enter)]"
UiPath Selectors
servo selector generates UiPath selector strings compatible with UiPath UIAutomation:
servo selector e5
<wnd app='notepad.exe' title='Untitled - Notepad' /><ctrl name='Text Editor' role='document' />
Sessions
Use --session (-s) to run isolated servo instances:
servo targets -s sess1
servo snapshot w1 -s sess1
servo click e5 -s sess1
Each session has its own server, refs, and state.
Cleanup: When automation is done, stop all servers — both named and default sessions:
servo server --stop # Stop default session
servo server --stop -s sess1 # Stop named session
Common Patterns
Select from dropdown/list
DropDown and List elements show options as children. Use the parent's ref and the option's name:
servo select e10 "Blue" # Select by option name
The current selection is shown as inline text after : or as a child marked [selected]. Re-snapshot to confirm.
If options are missing, click the element to expand it and re-snapshot — some load children only when opened.
Fill a form
servo targets # Find the window
servo snapshot w1
servo type e5 "John Doe" --clear-before
servo type e6 "[email protected]" --clear-before
servo select e8 "USA"
servo click e10 # Submit
servo snapshot w1 # Verify
Navigate a menu
servo click e13 # Click "File"
servo snapshot w1 # See submenu
servo click e42 # Click submenu item
Toggle a checkbox
servo click e6
servo snapshot w1 # [selected] = checked
Expand a tree node
servo click e108 --type Double # Double-click to expand tree item
servo snapshot w1 # Snapshot to see children
Switch tabs
servo click e115 # Click the tab you want
servo snapshot w1 # Verify tab is now [selected] and content changed
Application Guides
Browsers
Prerequisites
- UiPath browser extensions must be installed for b-refs to appear in servo targets
- Install: https://docs.uipath.com/studio/standalone/latest/user-guide/about-extensions
Targeting
Use b-refs (not w-refs) for web content — b-refs provide the DOM tree.
servo targets
# - Window "My Page - Google Chrome" [ref=w1]
# - Browser "My Page" [ref=b1]
servo snapshot b1 # Preferred: snapshot the browser tab
- Internal browser pages (new tab, settings, bookmarks, downloads, etc.) are NOT available as b-refs. The browser extension cannot access these pages, so they appear only as w-refs.
- Discarded tabs show the [discarded] state; servo attempts to activate them automatically before interaction.
- After page navigation, re-snapshot to get fresh refs.
Input Method Selection
Chromium (Chrome, Edge): Use -i WebBrowserDebugger — more reliable than HardwareEvents for web elements. Fallback: ControlApi, then HardwareEvents.
Firefox: Use -i ControlApi — WebBrowserDebugger is not supported on Firefox. Fallback: HardwareEvents.
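The preference order above can be written down once and reused. The sketch below prints the methods to try, most preferred first. `browser_input_methods` is a hypothetical helper name, and the lowercase browser keys (chrome, edge, firefox) are labels chosen for this example, not values servo reports.

```shell
# Print the input methods to try for a browser, most preferred first.
# Unknown browsers fall back to the default method, HardwareEvents.
browser_input_methods() {
  case "$1" in
    chrome|edge) echo "WebBrowserDebugger ControlApi HardwareEvents" ;;
    firefox)     echo "ControlApi HardwareEvents" ;;
    *)           echo "HardwareEvents" ;;
  esac
}
```

A caller could loop over the result and pass each name to -i in turn. Because an interaction can have no visible effect without reporting an error, check the outcome with a re-snapshot or servo get before moving on to the next method.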
SAP WinGUI
Prerequisites
- For best results, SAP GUI Scripting should be enabled (server and client).
- Setup guide: https://docs.uipath.com/activities/other/latest/ui-automation/sap-wingui-configuration-steps
Framework Selection
- SAP Logon (connection picker) -> --framework UiaOnly
- All other SAP GUI windows (after connecting) -> --framework Default (or omit)
servo snapshot w1 --framework UiaOnly # SAP Logon window
servo snapshot w1 # SAP session window (Default)
Transaction Code Navigation
servo snapshot w1 # Get ref for the command field
servo type e1 -i ControlApi "/nVA01[k(enter)]" # Navigate to transaction VA01
servo snapshot w1 # New transaction screen loaded with new refs
Reading Table Data
Snapshots only show rows currently in view. Maximize first for more rows:
servo window w1 Maximize
servo snapshot w1
Use extract-table to get all rows automatically:
servo extract-table e15 --timeout 120 # Full table (increase timeout for large tables)
To interact with rows not in view, scroll and re-snapshot:
servo wheel e15 --direction Down -c 5 # Scroll down
servo snapshot w1 # Fresh refs for newly visible rows
Status Bar Messages
SAP confirms operations via the status bar at the bottom. After an action:
servo get e99 text # Read status bar (ref varies)
SAP Tips
- Use longer timeouts for SAP operations: --timeout 60
- Check status bar messages to confirm operations
- SAP tables only expose visible rows in snapshots; use extract-table for full data
Error Recovery
Possible misconfiguration: If you suspect an app is not properly configured for automation (e.g., missing extension, scripting disabled), check the relevant Application Guide above.
Empty/partial snapshot — Wrong framework or window not ready:
servo window w1 BringToForeground # Ensure window is visible
servo window w1 Maximize # Maximize to see all elements
servo snapshot w1 --framework UiaOnly # Try different framework
Dropdown/list options not visible - Click to expand, then re-snapshot:
servo click e10 # Click the DropDown to expand it
servo snapshot w1 # Re-snapshot to see the options
Interaction has no visible effect:
servo get-all e5 # Check element attributes for clues
servo click e5 -i ControlApi # Try a different input method
Click lands on wrong spot - The click feedback shows screen coordinates. Take a screenshot and check whether those coordinates are inside the intended element. If not:
servo click e5 --origin TopLeft --offset "5,5" # Use a different origin point instead of center
servo click e5 -i ControlApi # Might ignore the coordinates
servo snapshot w1 --framework UiaOnly # Re-snapshot with different framework (bounds may differ)
servo click e10 # Click a child element that may be more reliably located
Connection error - Server in a bad state:
servo server --kill-all # Kill all servers, then retry
servo targets # Reconnects automatically
Telemetry
Servo collects anonymous telemetry. To disable it, set the environment variable:
UIPATH_SERVO_TELEMETRY=0
The server reads this variable at startup, so restart it after changing the setting:
servo server --kill-all