midscene-ios
v0.1.5
Midscene.js for iOS automation
midscene-ios
iOS automation package for Midscene.js with coordinate mapping support for iOS device mirroring.
Features
- iOS Device Mirroring: Control iOS devices through screen mirroring on macOS
- Coordinate Mapping: Automatic transformation from iOS coordinates to macOS screen coordinates
- AI Integration: Use natural language to interact with iOS interfaces
- Screenshot Capture: Take region-specific screenshots of iOS mirrors
- PyAutoGUI Backend: Reliable macOS system control through a Python server
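The coordinate mapping feature listed above can be pictured as a scale-and-offset transform from the iOS screen into the mirror window's rectangle on the macOS screen. The following is a minimal sketch of that idea, not the package's actual implementation: the type names, the `iosToScreen` function, and the iOS logical size are all hypothetical, and only the `mirrorX`/`mirrorY`/`mirrorWidth`/`mirrorHeight` fields come from this README.

```typescript
// Hypothetical types for illustration; not the package's exported types.
interface MirrorRect {
  mirrorX: number;      // left edge of the mirror window on the macOS screen
  mirrorY: number;      // top edge of the mirror window on the macOS screen
  mirrorWidth: number;  // mirror width on the macOS screen
  mirrorHeight: number; // mirror height on the macOS screen
}

interface ScreenPoint {
  left: number;
  top: number;
}

interface LogicalSize {
  width: number;
  height: number;
}

// Scale an iOS point into the mirror rectangle, then offset it by the
// mirror window's position on the macOS screen.
function iosToScreen(p: ScreenPoint, ios: LogicalSize, mirror: MirrorRect): ScreenPoint {
  const scaleX = mirror.mirrorWidth / ios.width;
  const scaleY = mirror.mirrorHeight / ios.height;
  return {
    left: Math.round(mirror.mirrorX + p.left * scaleX),
    top: Math.round(mirror.mirrorY + p.top * scaleY),
  };
}

const mirrorRect: MirrorRect = { mirrorX: 692, mirrorY: 161, mirrorWidth: 344, mirrorHeight: 764 };
// Assumed iOS size, chosen only so that the scale factor is exactly 0.5.
const assumedIosSize: LogicalSize = { width: 688, height: 1528 };
const mappedPoint = iosToScreen({ left: 100, top: 200 }, assumedIosSize, mirrorRect);
// mappedPoint is { left: 742, top: 261 }
```

Under these assumptions, a tap at iOS point (100, 200) would be issued at macOS screen point (742, 261).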
Installation
npm install midscene-ios
Prerequisites
Python 3 with required packages:
pip3 install flask pyautogui
macOS Accessibility Permissions:
- Go to System Preferences → Security & Privacy → Privacy → Accessibility
- Add your terminal application to the list
- Required for PyAutoGUI to control mouse and keyboard
iOS Device Mirroring:
- iPhone Mirroring (macOS Sequoia)
Quick Start
1. Start PyAutoGUI Server
cd packages/ios/idb
python3 auto_server.py 1412
2. Configure iOS Mirroring
First, get the position and size of the mirror window on the macOS screen (for example, with AppleScript), then:
import { iOSDevice, iOSAgent } from 'midscene-ios';
const device = new iOSDevice({
serverPort: 1412,
mirrorConfig: {
mirrorX: 692, // Mirror position on macOS screen
mirrorY: 161,
mirrorWidth: 344, // Mirror size on macOS screen
mirrorHeight: 764
}
});
await device.connect();
const agent = new iOSAgent(device);
// AI interactions with automatic coordinate mapping
await agent.aiTap('Settings app');
await agent.aiInput('Wi-Fi', 'Search settings');
const settings = await agent.aiQuery('string[], visible settings');
3. Basic Device Control
// Direct coordinate operations
await device.tap({ left: 100, top: 200 });
await device.input('Hello', { left: 150, top: 300 });
await device.scroll({ direction: 'down', distance: 200 });
// Screenshots (automatically crops to iOS mirror region)
const screenshot = await device.screenshotBase64();
API Reference
agentFromPyAutoGUI(options?)
Creates an iOS agent with PyAutoGUI backend.
Options:
- serverUrl?: string - Custom server URL (default: http://localhost:1412)
- serverPort?: number - Server port (default: 1412)
- autoDismissKeyboard?: boolean - Auto dismiss keyboard (not applicable for desktop)
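How `serverUrl` and `serverPort` interact when both are given is not stated here. As a rough sketch, one plausible resolution rule is shown below; the `resolveServerUrl` helper, the `ServerOptions` type, and the precedence (explicit URL first, then port, then the documented default) are assumptions for illustration, not the package's confirmed behavior.

```typescript
// Hypothetical option shape mirroring the two server options listed above.
interface ServerOptions {
  serverUrl?: string;
  serverPort?: number;
}

// Assumed precedence: an explicit serverUrl wins; otherwise the URL is built
// from serverPort, which falls back to the documented default of 1412.
function resolveServerUrl(options: ServerOptions = {}): string {
  if (options.serverUrl) {
    return options.serverUrl.replace(/\/+$/, ''); // drop trailing slashes
  }
  const port = options.serverPort ?? 1412;
  return `http://localhost:${port}`;
}

const defaultServerUrl = resolveServerUrl();
const customPortUrl = resolveServerUrl({ serverPort: 1413 });
```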
iOSDevice Methods
launch(uri: string): Promise<iOSDevice>
Launch an application or URL.
- For URLs: await device.launch('https://example.com')
- For apps: await device.launch('Safari')
size(): Promise<Size>
Get screen dimensions and pixel ratio.
screenshotBase64(): Promise<string>
Take a screenshot and return as base64 string.
tap(point: Point): Promise<void>
Click at the specified coordinates.
hover(point: Point): Promise<void>
Move mouse to the specified coordinates.
input(text: string): Promise<void>
Type text using the keyboard.
keyboardPress(key: string): Promise<void>
Press a specific key. Supported keys:
- 'Return', 'Enter' - Enter key
- 'Tab' - Tab key
- 'Space' - Space bar
- 'Backspace' - Backspace
- 'Delete' - Delete key
- 'Escape' - Escape key
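To show how these key names could relate to the server's "key" action, here is a small sketch that translates each supported name into a PyAutoGUI key string and builds the JSON body. The `PYAUTOGUI_KEYS` table and the `keyPayload` helper are hypothetical; the package may translate keys differently. The target strings themselves (`return`, `enter`, `tab`, `space`, `backspace`, `delete`, `esc`) are valid PyAutoGUI key names.

```typescript
// Supported key names as listed above.
type SupportedKey = 'Return' | 'Enter' | 'Tab' | 'Space' | 'Backspace' | 'Delete' | 'Escape';

// Assumed mapping from the names above to PyAutoGUI key strings.
const PYAUTOGUI_KEYS: Record<SupportedKey, string> = {
  Return: 'return',
  Enter: 'enter',
  Tab: 'tab',
  Space: 'space',
  Backspace: 'backspace',
  Delete: 'delete',
  Escape: 'esc',
};

// Build the JSON body for the server's "key" action.
function keyPayload(key: SupportedKey): { action: 'key'; key: string } {
  return { action: 'key', key: PYAUTOGUI_KEYS[key] };
}

const returnKeyBody = JSON.stringify(keyPayload('Return'));
// returnKeyBody is '{"action":"key","key":"return"}'
```

Typing the table as `Record<SupportedKey, string>` means the compiler flags any supported key that is missing a mapping.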
scroll(options: ScrollOptions): Promise<void>
Scroll in the specified direction.
ScrollOptions:
- direction: 'up' | 'down' | 'left' | 'right'
- distance?: number - Scroll distance in pixels (default: 100)
PyAutoGUI Server API
The Python server accepts POST requests to /run with JSON payloads:
Supported Actions
Click
{
"action": "click",
"x": 100,
"y": 100
}
Move (Hover)
{
"action": "move",
"x": 200,
"y": 200,
"duration": 0.2
}
Drag
{
"action": "drag",
"x": 100,
"y": 100,
"x2": 200,
"y2": 200,
"duration": 0.5
}
Type
{
"action": "type",
"text": "Hello World",
"interval": 0.0
}
Key Press
{
"action": "key",
"key": "return"
}
Hotkey Combination
{
"action": "hotkey",
"keys": ["cmd", "c"]
}
Scroll
{
"action": "scroll",
"x": 400,
"y": 300,
"clicks": 3
}
Sleep
{
"action": "sleep",
"seconds": 1.0
}
Health Check
GET /health - Returns server status and screen information.
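For callers that talk to the server directly, the payloads above could be modeled as a discriminated union so that each action carries only its own fields. This is a sketch under assumptions: the `ServerAction` and `RunRequest` types and the `buildRunRequest` helper are illustrative names, and treating `duration` and `interval` as optional is a guess, since the README only shows them present.

```typescript
// Payload shapes copied from the examples above; type names are illustrative.
type ServerAction =
  | { action: 'click'; x: number; y: number }
  | { action: 'move'; x: number; y: number; duration?: number }
  | { action: 'drag'; x: number; y: number; x2: number; y2: number; duration?: number }
  | { action: 'type'; text: string; interval?: number }
  | { action: 'key'; key: string }
  | { action: 'hotkey'; keys: string[] }
  | { action: 'scroll'; x: number; y: number; clicks: number }
  | { action: 'sleep'; seconds: number };

interface RunRequest {
  url: string;
  method: 'POST';
  headers: { 'Content-Type': string };
  body: string;
}

// Describe the POST /run request without sending it.
function buildRunRequest(baseUrl: string, payload: ServerAction): RunRequest {
  return {
    url: `${baseUrl}/run`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  };
}

const clickRequest = buildRunRequest('http://localhost:1412', { action: 'click', x: 100, y: 100 });
```

The resulting object can be handed to any HTTP client; keeping request construction separate from sending also makes the payloads easy to check without a running server.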
Architecture
┌─────────────────────┐  HTTP   ┌─────────────────────┐  PyAutoGUI  ┌─────────────────────┐
│ TypeScript          │ ──────> │ Python Server       │ ──────────> │ macOS System        │
│ iOS Agent           │         │ (Flask + PyAutoGUI) │             │ (Mouse/Keyboard)    │
└─────────────────────┘         └─────────────────────┘             └─────────────────────┘
Troubleshooting
Accessibility Permissions
If you get permission errors, ensure your terminal has accessibility permissions:
- System Preferences → Security & Privacy → Privacy
- Select "Accessibility" from the left sidebar
- Click the lock to make changes
- Add your terminal application to the list
Python Dependencies
# Install required Python packages
pip3 install flask pyautogui
# On macOS, you might also need:
pip3 install pillow
Port Already in Use
If port 1412 is already in use, specify a different port:
const agent = await agentFromPyAutoGUI({ serverPort: 1413 });
Example
See examples/ios-mirroring-demo.js for a complete usage example.
License
MIT
