@cuylabs/computer-agent
v0.1.1
AI agent tools for computer automation - integrates @cuylabs/computer with AI SDK and agent-core
AI agent for computer automation. Integrates @cuylabs/computer with @cuylabs/agent-core for a complete computer-use agent experience.
Features
- 🖥️ OpenAI Computer Use - Native support for OpenAI's computer-use-preview model
- 🤖 Anthropic Claude - Full support for Claude with our universal tool schema
- 🔧 Unified Tools - Same tool definitions work with both providers
- 📦 agent-core Integration - Seamless integration with agent-core's Agent class
- 🎯 Type-Safe - Full TypeScript support
Installation
npm install @cuylabs/computer-agent @cuylabs/computer @cuylabs/agent-core
Provider Dependencies
# For OpenAI
npm install openai
# For Anthropic
npm install ai @ai-sdk/anthropic
Quick Start
Unified API (Recommended)
The simplest way to create a computer agent. The same call works with both supported providers, OpenAI and Anthropic.
import { Computer } from "@cuylabs/computer";
import { createComputerAgent } from "@cuylabs/computer-agent";
const computer = new Computer({
backend: "docker",
image: "cuylabs/desktop:latest",
});
await computer.start();
// OpenAI - uses computer-use-preview internally
const agent = await createComputerAgent({
model: "openai/gpt-4.1",
computer,
});
// OR Anthropic: use this call instead of the one above (declare `agent` only once)
// const agent = await createComputerAgent({
// model: "anthropic/claude-sonnet-4-20250514",
// computer,
// });
// Same interface for both!
for await (const event of agent.chat("session-1", "Open the file manager")) {
if (event.type === "text-delta") {
process.stdout.write(event.text);
}
}
await computer.stop();
Model String Format
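Model strings use the provider/model format (as in LiteLLM and OpenRouter). A bare model name is also accepted, in which case the provider is inferred from the name: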
| Model String | Provider | Notes |
|--------------|----------|-------|
| openai/gpt-4.1 | OpenAI | Uses computer-use-preview |
| openai/computer-use-preview | OpenAI | Explicit model |
| anthropic/claude-sonnet-4-20250514 | Anthropic | Standard tools |
| gpt-4.1 | OpenAI | Inferred from name |
| claude-sonnet-4-20250514 | Anthropic | Inferred from name |
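The table above can be read as a small parsing rule. The sketch below is a hypothetical helper, not part of the package's API: `parseModelString` and its inference rules (a `gpt-` or `claude-` prefix) are assumptions made for illustration, and it only splits the string. It does not reproduce the internal mapping of `openai/gpt-4.1` to computer-use-preview.

```typescript
// Hypothetical sketch: how a "provider/model" string could be parsed.
// Names and inference rules here are assumptions, not the package's code.
type Provider = "openai" | "anthropic";

interface ParsedModel {
  provider: Provider;
  model: string;
}

function parseModelString(input: string): ParsedModel {
  const slash = input.indexOf("/");
  if (slash !== -1) {
    // Explicit form: everything before the first slash is the provider.
    const provider = input.slice(0, slash);
    const model = input.slice(slash + 1);
    if (model.length === 0) {
      throw new Error(`Missing model name in "${input}"`);
    }
    if (provider === "openai" || provider === "anthropic") {
      return { provider, model };
    }
    throw new Error(`Unsupported provider: ${provider}`);
  }
  // Bare form: infer the provider from the model name (assumed rules).
  if (input.startsWith("gpt-") || input === "computer-use-preview") {
    return { provider: "openai", model: input };
  }
  if (input.startsWith("claude-")) {
    return { provider: "anthropic", model: input };
  }
  throw new Error(`Cannot infer provider from model name: ${input}`);
}
```

Failing loudly on an unknown prefix, rather than guessing a provider, keeps a typo in the model string from silently selecting the wrong backend.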
With OpenAI (Low-level)
import { Computer } from "@cuylabs/computer";
import { createAgent, createOpenAIStreamProvider } from "@cuylabs/computer-agent";
import { openai } from "@ai-sdk/openai"; // needs @ai-sdk/openai installed in addition to the packages listed under Installation
const computer = new Computer({
backend: "docker",
image: "cuylabs/desktop:latest",
});
await computer.start();
const agent = createAgent({
// Model is required but not used when streamProvider handles everything
model: openai("gpt-4o"),
streamProvider: createOpenAIStreamProvider({
computer,
apiKey: process.env.OPENAI_API_KEY,
}),
});
for await (const event of agent.chat("session-1", "Open the file manager")) {
if (event.type === "text-delta") {
process.stdout.write(event.text);
}
}
await computer.stop();
With Anthropic Claude
import { anthropic } from "@ai-sdk/anthropic";
import { Computer } from "@cuylabs/computer";
import { createAgent, createComputerTools } from "@cuylabs/computer-agent";
const computer = new Computer({
backend: "docker",
image: "cuylabs/desktop:latest",
});
await computer.start();
const agent = createAgent({
model: anthropic("claude-sonnet-4-20250514"),
tools: createComputerTools({ computer }),
systemPrompt: "You are a computer automation agent.",
});
for await (const event of agent.chat("session-1", "Open the file manager")) {
if (event.type === "text-delta") {
process.stdout.write(event.text);
}
}
await computer.stop();
Unified Imports
This package re-exports everything from @cuylabs/agent-core:
// Single import for everything
import {
// From agent-core
createAgent,
processStream,
Tool,
// From computer-agent (unified)
createComputerAgent,
// From computer-agent (low-level)
createComputerTools,
createOpenAIStreamProvider,
createOpenAIComputerStream,
} from "@cuylabs/computer-agent";
API Reference
createComputerAgent
The recommended way to create computer agents. Works with both supported providers (OpenAI and Anthropic).
import { createComputerAgent } from "@cuylabs/computer-agent";
const agent = await createComputerAgent({
model: "openai/gpt-4.1", // or "anthropic/claude-sonnet-4-20250514"
computer,
// Optional:
systemPrompt: "You are a helpful automation assistant.",
maxSteps: 15,
additionalTools: [...], // Extra tools (Anthropic only)
includeBash: true, // Include bash tool (default: true)
apiKey: "...", // Override env variable
display: { width: 1280, height: 800 },
debug: false,
});
Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| model | string | required | provider/model format |
| computer | Computer | required | Computer instance |
| systemPrompt | string | default prompt | System instructions |
| maxSteps | number | 15 | Max tool iterations |
| additionalTools | Tool[] | [] | Extra tools (Anthropic only) |
| includeBash | boolean | true | Include bash tool |
| apiKey | string | env var | Override API key |
| display | {width, height} | 1280×800 | Display dimensions |
| debug | boolean | false | Enable debug logging |
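To make the defaults in the table concrete, here is a minimal sketch of how they could be applied to a partial options object. `AgentOptionsSketch` and `resolveOptions` are local stand-ins invented for this example, not the package's types, and the sketch leaves out `computer`, `systemPrompt`, `additionalTools`, and `apiKey` for brevity.

```typescript
// Hypothetical stand-in types; the real option types come from
// @cuylabs/computer-agent and may differ.
interface Display {
  width: number;
  height: number;
}

interface AgentOptionsSketch {
  model: string; // required, provider/model format
  maxSteps?: number;
  includeBash?: boolean;
  display?: Display;
  debug?: boolean;
}

type ResolvedOptions = Required<AgentOptionsSketch>;

// Fill in the documented defaults for any option the caller omitted.
function resolveOptions(options: AgentOptionsSketch): ResolvedOptions {
  return {
    model: options.model,
    maxSteps: options.maxSteps ?? 15,
    includeBash: options.includeBash ?? true,
    display: options.display ?? { width: 1280, height: 800 },
    debug: options.debug ?? false,
  };
}
```

Using `??` rather than `||` matters here: an explicit `includeBash: false` is kept as given instead of falling back to the default.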
createComputerTools
Creates computer tools for use with agent-core (low-level).
import { createComputerTools } from "@cuylabs/computer-agent";
const tools = createComputerTools({ computer });
// Use with createAgent
const agent = createAgent({
model: anthropic("claude-sonnet-4-20250514"),
tools,
});
Supported Actions
| Action | Description | Parameters |
|--------|-------------|------------|
| screenshot | Capture current screen | - |
| click | Mouse click | x, y, button, count |
| mouse_move | Move cursor | x, y |
| drag | Drag from A to B | startX, startY, endX, endY |
| key | Press key/combo | key (e.g., "Return", "ctrl+c") |
| type | Type text | text |
| scroll | Scroll screen | direction, amount, x, y |
| wait | Wait duration | duration (seconds) |
| cursor_position | Get cursor position | - |
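As an illustration of the table above, the actions can be modelled as one flat object with an `action` field plus optional parameters. This is a sketch under assumptions: the `FlatAction` type, the `missingParams` helper, and the choice of which parameters count as required are all hypothetical, since the table does not say which parameters are optional, and the package's actual tool schema may differ.

```typescript
// Hypothetical flat action shape based on the Supported Actions table.
type ActionName =
  | "screenshot" | "click" | "mouse_move" | "drag" | "key"
  | "type" | "scroll" | "wait" | "cursor_position";

interface FlatAction {
  action: ActionName;
  x?: number;
  y?: number;
  button?: string;
  count?: number;
  startX?: number;
  startY?: number;
  endX?: number;
  endY?: number;
  key?: string; // e.g. "Return" or "ctrl+c"
  text?: string;
  direction?: string;
  amount?: number;
  duration?: number; // seconds
}

// Assumed required parameters per action (an illustration, not the real schema).
const REQUIRED_PARAMS: Record<ActionName, (keyof FlatAction)[]> = {
  screenshot: [],
  click: ["x", "y"],
  mouse_move: ["x", "y"],
  drag: ["startX", "startY", "endX", "endY"],
  key: ["key"],
  type: ["text"],
  scroll: ["direction", "amount"],
  wait: ["duration"],
  cursor_position: [],
};

// Return the names of assumed-required parameters that are absent.
function missingParams(action: FlatAction): string[] {
  return REQUIRED_PARAMS[action.action].filter(
    (name) => action[name] === undefined,
  );
}
```

A check like this could run before an action is dispatched, so that an incomplete tool call is reported back to the model instead of reaching the desktop.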
createOpenAIStreamProvider
Creates a stream provider for OpenAI's computer-use-preview model.
import { createOpenAIStreamProvider, createAgent } from "@cuylabs/computer-agent";
import { openai } from "@ai-sdk/openai";
const streamProvider = createOpenAIStreamProvider({
computer,
apiKey: process.env.OPENAI_API_KEY,
model: "computer-use-preview", // default
environment: "browser", // or "mac", "windows", "linux"
// Optional: safety check approval callback
onSafetyCheck: async (checks) => {
console.log("Safety checks:", checks);
return checks.map(c => c.id); // Acknowledge all
},
});
// Model is required but overridden by streamProvider
const agent = createAgent({
model: openai("gpt-4o"),
streamProvider,
});
createOpenAIComputerStream
Low-level streaming for OpenAI. Returns an AI SDK-compatible stream result.
import { createOpenAIComputerStream } from "@cuylabs/computer-agent";
const stream = await createOpenAIComputerStream({
computer,
apiKey: process.env.OPENAI_API_KEY,
messages: [{ role: "user", content: "Open Firefox" }],
system: "You are a computer automation assistant.",
displayWidth: 1280,
displayHeight: 800,
});
// Consume the stream directly
for await (const chunk of stream.fullStream) {
console.log(chunk);
}
Examples
See examples/ for complete working examples:
| Example | Description |
|---------|-------------|
| unified-agent.ts | Unified API - same code, any provider! ⭐ |
| openai-agent-core.ts | OpenAI with Agent wrapper |
| openai-stream.ts | Low-level OpenAI streaming |
| anthropic-agent-core.ts | Anthropic with Agent wrapper |
| anthropic-ai-sdk.ts | Direct AI SDK usage |
Run examples:
cd examples
cp .env.example .env
# Edit .env with your API keys
# Unified API (recommended!)
npx tsx examples/unified-agent.ts openai
npx tsx examples/unified-agent.ts anthropic
# Provider-specific examples
npx tsx examples/openai-agent-core.ts
npx tsx examples/anthropic-agent-core.ts
Architecture
┌─────────────────────────────────────────────────────────┐
│ @cuylabs/computer-agent │
├─────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │
│ │ OpenAI │ │ Anthropic │ │ Tools │ │
│ │ Adapter │ │ (via AI SDK)│ │ computer │ │
│ │ │ │ │ │ bash │ │
│ └──────┬──────┘ └──────┬──────┘ └──────┬─────┘ │
│ │ │ │ │
│ └──────────────────┼──────────────────┘ │
│ │ │
│ ┌────────▼────────┐ │
│ │ @cuylabs/ │ │
│ │ agent-core │ │
│ │ (Agent loop) │ │
│ └────────┬────────┘ │
│ │ │
│ ┌────────▼────────┐ │
│ │ @cuylabs/ │ │
│ │ computer │ │
│ │ (Docker/QEMU) │ │
│ └─────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘
Documentation
- Architecture - Package structure and patterns
- Examples Guide - Detailed example explanations
- Tool Schema - Why we use flat schemas
Related Packages
- @cuylabs/computer - Core computer automation (Docker, QEMU)
- @cuylabs/agent-core - Agent loop and streaming
License
MIT
