@codmir/agent-mobile-sdk
v0.1.0
Published
Device-as-a-tool SDK — let AI agents control mobile apps via Socket.IO
@codmir/agent-mobile-sdk
Device-as-a-tool — Let AI agents control mobile apps.
The first SDK that lets an AI agent running on your server reach into a user's phone and perform actions there: navigate screens, fill forms, press buttons, read content, and trigger calls. This inverts how mobile apps work today, where the app on the device initiates every interaction with the server.
Install
npm install @codmir/agent-mobile-sdk
Quick Start
Mobile Client (React Native)
import { MobileAgentBridge, ALL_PRESETS } from "@codmir/agent-mobile-sdk/react-native";
import { io } from "socket.io-client";
import { Alert } from "react-native";
import { router } from "expo-router";
const bridge = new MobileAgentBridge({
  serverUrl: "https://your-server.com",
  authToken: "user-jwt-token",
  projectId: "project-123",
  // User sees a prompt before dangerous actions execute
  onApprovalRequired: async (request) => {
    return new Promise((resolve) => {
      Alert.alert(
        "AI wants to act",
        `${request.description}\n\n${JSON.stringify(request.params)}`,
        [
          { text: "Deny", onPress: () => resolve(false) },
          { text: "Allow", onPress: () => resolve(true) },
        ]
      );
    });
  },
});
// Register action handlers
bridge
  .registerAction(ALL_PRESETS[0], async (params) => {
    // NAVIGATE
    router.push(params.route as string);
    return { navigated: true };
  })
  .registerAction(ALL_PRESETS[1], async (params) => {
    // FILL_FIELD: your app's state management fills the field
    return { filled: true };
  })
  .registerAction(ALL_PRESETS[3], async () => {
    // READ_SCREEN: return current screen state
    return {
      route: "/project/123/tasks",
      fields: [],
      buttons: [{ id: "create-task", label: "Create Task", enabled: true }],
    };
  });
// Connect to workroom
await bridge.connect(io);
Server Side (NestJS / Node.js)
import { randomUUID } from "node:crypto";
import {
  createMobileActionTool,
  createReadScreenTool,
} from "@codmir/agent-mobile-sdk/server";
// Register as tools in your agent's tool registry
const tools = [
  createMobileActionTool(),
  createReadScreenTool(),
  // ... your other tools
];
// When the agent calls mobile_action, forward to the connected device.
// `socket` is assumed to be the Socket.IO connection for that device.
socket.emit("mobile:command", {
  id: randomUUID(),
  type: "navigate",
  params: { route: "/project/123/tasks" },
  timestamp: Date.now(),
});
// Listen for results
socket.on("mobile:result", (result) => {
  // Feed back to the agent's tool_use response
  console.log(result.success, result.data);
});
Architecture
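The server snippet emits a command and listens for results separately, so each result has to be matched to the command that produced it. One way to do that is to track pending command ids, as in the minimal sketch below. `CommandDispatcher`, `SocketLike`, and the `id` field on the result are assumptions made for illustration; this README does not document whether results echo the command id.

```typescript
// Message shapes mirror the fields used in the server snippet.
interface MobileCommand {
  id: string;
  type: string;
  params: Record<string, unknown>;
  timestamp: number;
}

interface MobileResult {
  id: string; // assumption: the device echoes the command id back
  success: boolean;
  data?: unknown;
}

// The minimal subset of a socket that this sketch relies on.
interface SocketLike {
  emit(event: string, payload: unknown): void;
  on(event: string, listener: (payload: any) => void): void;
}

// Hypothetical helper, not part of the SDK.
class CommandDispatcher {
  private socket: SocketLike;
  private pending = new Map<string, (result: MobileResult) => void>();
  private nextId = 0;

  constructor(socket: SocketLike) {
    this.socket = socket;
    socket.on("mobile:result", (result: MobileResult) => {
      const resolve = this.pending.get(result.id);
      if (!resolve) return; // unknown or already settled command
      this.pending.delete(result.id);
      resolve(result);
    });
  }

  // Sends one command and resolves when the matching result arrives.
  send(type: string, params: Record<string, unknown>): Promise<MobileResult> {
    const id = `cmd-${++this.nextId}`; // a UUID would serve equally well
    const command: MobileCommand = { id, type, params, timestamp: Date.now() };
    return new Promise((resolve) => {
      this.pending.set(id, resolve);
      this.socket.emit("mobile:command", command);
    });
  }
}
```

A production version would likely also reject pending commands after a timeout or when the device disconnects; that is omitted here to keep the sketch short.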
┌──────────────────┐      Socket.IO      ┌──────────────────┐
│  AI Workroom     │◄───────────────────►│  Mobile App      │
│                  │                     │                  │
│  Agent calls     │   mobile:command    │  Bridge receives │
│  mobile_action   │────────────────────►│  & executes      │
│                  │                     │                  │
│  Agent reads     │   mobile:result     │  Handler returns │
│  tool result     │◄────────────────────│  result          │
│                  │                     │                  │
│                  │ mobile:read_screen  │  Screen reader   │
│                  │────────────────────►│  returns state   │
│                  │ mobile:screen_state │                  │
│                  │◄────────────────────│                  │
└──────────────────┘                     └──────────────────┘
Action Types
| Action | Danger | Description |
|--------|--------|-------------|
| navigate | safe | Navigate to a screen |
| fill_field | safe | Fill a text input |
| press_button | confirm | Press a button (prompts user) |
| read_screen | safe | Read current screen state |
| scroll | safe | Scroll the view |
| show_notification | safe | Show a local notification |
| trigger_call | confirm | Join a voice room |
| set_status | safe | Update online status |
| custom | varies | App-specific actions |
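For code that needs these defaults on the server or in tests, the table can be written as a typed lookup. This is only a sketch: `DEFAULT_DANGER` and `defaultDangerFor` are hypothetical names, and the SDK's own presets may be structured differently.

```typescript
type DangerLevel = "safe" | "confirm" | "dangerous";

type BuiltInAction =
  | "navigate"
  | "fill_field"
  | "press_button"
  | "read_screen"
  | "scroll"
  | "show_notification"
  | "trigger_call"
  | "set_status";

// Values copied from the Action Types table; the constant itself is hypothetical.
const DEFAULT_DANGER: Record<BuiltInAction, DangerLevel> = {
  navigate: "safe",
  fill_field: "safe",
  press_button: "confirm",
  read_screen: "safe",
  scroll: "safe",
  show_notification: "safe",
  trigger_call: "confirm",
  set_status: "safe",
};

// Custom actions declare their own level when registered, so any name that is
// not a built-in action returns undefined and the caller decides what to do.
function defaultDangerFor(action: string): DangerLevel | undefined {
  return Object.prototype.hasOwnProperty.call(DEFAULT_DANGER, action)
    ? DEFAULT_DANGER[action as BuiltInAction]
    : undefined;
}
```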
Danger Levels
- safe — Executes immediately, no user prompt
- confirm — Shows approval dialog on device before executing
- dangerous — Requires explicit user confirmation with action details
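The gating these levels describe can be sketched as a small check that runs before a handler. `mayExecute` and the `ApprovalRequest` shape are assumptions for illustration, modeled on the `onApprovalRequired` callback in the Quick Start. The sketch treats confirm and dangerous identically, because this README does not show how the SDK distinguishes them beyond what the prompt displays.

```typescript
type DangerLevel = "safe" | "confirm" | "dangerous";

// Assumed shape, based on the fields the Quick Start callback reads.
interface ApprovalRequest {
  description: string;
  params: Record<string, unknown>;
}

type ApprovalCallback = (request: ApprovalRequest) => Promise<boolean>;

// Hypothetical helper: resolves to true when the handler is allowed to run.
async function mayExecute(
  danger: DangerLevel,
  request: ApprovalRequest,
  onApprovalRequired?: ApprovalCallback,
): Promise<boolean> {
  // Safe actions run immediately, with no user prompt.
  if (danger === "safe") return true;
  // Without an approval callback there is nobody to ask, so fail closed.
  if (!onApprovalRequired) return false;
  // Confirm and dangerous actions both wait for the user's decision.
  return onApprovalRequired(request);
}
```

Denying by default when no callback is configured is a design choice of this sketch, picked so that a misconfigured client cannot run a gated action silently.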
Custom Actions
bridge.registerAction(
  {
    type: "custom",
    description: "Create a new task in the current project",
    danger: "confirm",
    params: {
      title: { type: "string", required: true },
      priority: { type: "string", enum: ["low", "medium", "high"] },
    },
  },
  async (params) => {
    // createTask is your app's own function, not part of the SDK
    const task = await createTask(params.title as string, params.priority as string);
    return { taskId: task.id };
  }
);
What Can the AI Do?
The user says to the AI workroom:
"Create a task called 'Fix login bug' with high priority in the mobile app"
The agent:
- Calls mobile_read_screen → sees user is on the project dashboard
- Calls mobile_action({ action: "navigate", params: { route: "/tasks/new" } })
- Calls mobile_action({ action: "fill_field", params: { fieldId: "title", value: "Fix login bug" } })
- Calls mobile_action({ action: "select", params: { fieldId: "priority", value: "high" } })
- Calls mobile_action({ action: "press_button", params: { buttonId: "create" } }) → user sees approval prompt → approves
- Task created.
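The mobile_action calls in this walkthrough can be sketched as a fixed script that runs in order and stops at the first failure, such as a denied approval. In practice the agent picks each step from the previous tool result rather than following a fixed list. `runSteps`, `SendAction`, and the `success` flag on results are assumptions for illustration.

```typescript
interface Step {
  action: string;
  params: Record<string, unknown>;
}

interface StepResult {
  success: boolean;
  data?: unknown;
}

// Hypothetical transport: forwards one mobile_action call and resolves with its result.
type SendAction = (step: Step) => Promise<StepResult>;

// The four mobile_action calls from the walkthrough, as written there.
// "select" is not listed in the Action Types table, so it is presumably a custom action.
const createTaskSteps: Step[] = [
  { action: "navigate", params: { route: "/tasks/new" } },
  { action: "fill_field", params: { fieldId: "title", value: "Fix login bug" } },
  { action: "select", params: { fieldId: "priority", value: "high" } },
  { action: "press_button", params: { buttonId: "create" } },
];

// Runs the steps sequentially and reports how many completed before any failure.
async function runSteps(
  steps: Step[],
  send: SendAction,
): Promise<{ completed: number; ok: boolean }> {
  let completed = 0;
  for (const step of steps) {
    const result = await send(step);
    if (!result.success) return { completed, ok: false };
    completed += 1;
  }
  return { completed, ok: true };
}
```

If the user denies the approval prompt on press_button, a transport that reports the denial as a failed result would make `runSteps` return after three completed steps.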
License
Apache-2.0
