@onkernel/cu-playwright

v0.1.2

Published

7 months ago

Computer Use x Playwright SDK

raf-kernel

cat-kernel

computer-use playwright anthropic automation ai typescript

Computer Use Playwright SDK

A TypeScript SDK that combines Anthropic's Computer Use capabilities with Playwright for browser automation tasks. This SDK provides a clean, type-safe interface for automating browser interactions using Claude's computer use abilities.

Features

🤖 Simple API: Single ComputerUseAgent class for all computer use tasks
🔄 Dual Response Types: Support for both text and structured (JSON) responses
🛡️ Type Safety: Full TypeScript support with Zod schema validation
⚡ Optimized: Clean error handling and robust JSON parsing
🎯 Focused: Clean API surface with sensible defaults

Installation

npm install @onkernel/cu-playwright
# or
yarn add @onkernel/cu-playwright
# or
bun add @onkernel/cu-playwright

Quick Start

import { chromium } from 'playwright';
import { ComputerUseAgent } from '@onkernel/cu-playwright';

const browser = await chromium.launch({ headless: false });
const page = await browser.newPage();

// Navigate to Hacker News manually first
await page.goto("https://news.ycombinator.com/");

const agent = new ComputerUseAgent({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  page,
});

// Simple text response
const answer = await agent.execute('Tell me the title of the top story');
console.log(answer);

await browser.close();

API Reference

`ComputerUseAgent`

The main class for computer use automation.

Constructor

new ComputerUseAgent(options: {
  apiKey: string;
  page: Page;
  model?: string;
})

Parameters:

apiKey (string): Your Anthropic API key. Get one from the Anthropic Console
page (Page): Playwright page instance to control
model (string, optional): Anthropic model to use. Defaults to 'claude-sonnet-4-20250514'

Supported Models: See Anthropic's Computer Use documentation for the latest model compatibility.

`execute()` Method

async execute<T = string>(
  query: string,
  schema?: z.ZodSchema<T>,
  options?: {
    systemPromptSuffix?: string;
    thinkingBudget?: number;
  }
): Promise<T>

Parameters:

query (string): The task description for Claude to execute
schema (ZodSchema, optional): Zod schema for structured responses. When provided, the response will be validated against this schema
options (object, optional):
- systemPromptSuffix (string): Additional instructions appended to the system prompt
- thinkingBudget (number): Token budget for Claude's internal reasoning process. Default: 1024. See Extended Thinking documentation for details

Returns:

Promise<T>: When schema is provided, returns validated data of type T
Promise<string>: When no schema is provided, returns the text response
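
To make the validation step concrete, here is a minimal sketch of parsing a JSON text response and checking it against a schema. It does not show the SDK's internals: `SchemaLike` and `parseStructured` are hypothetical names, and the hand-written schema stands in for a Zod schema (which exposes a compatible `parse` method) so the example has no dependencies.

```typescript
// Hypothetical stand-in for a Zod schema: anything with a parse() method
// that returns typed data or throws on invalid input.
interface SchemaLike<T> {
  parse(data: unknown): T;
}

// Sketch only: parse a JSON text response, then validate it against a schema.
function parseStructured<T>(text: string, schema: SchemaLike<T>): T {
  const raw: unknown = JSON.parse(text); // throws SyntaxError on malformed JSON
  return schema.parse(raw); // throws if the shape does not match
}

// Hand-written schema (an assumption for illustration, not part of the SDK).
const storySchema: SchemaLike<{ title: string; points: number }> = {
  parse(data) {
    const d = data as { title?: unknown; points?: unknown } | null;
    if (typeof d?.title !== "string" || typeof d?.points !== "number") {
      throw new Error("Invalid story");
    }
    return { title: d.title, points: d.points };
  },
};

const story = parseStructured('{"title":"Example Story","points":150}', storySchema);
console.log(story.title, story.points);
```

Either failure mode (malformed JSON or a schema mismatch) surfaces as a thrown error, so a caller sees either data of type T or an exception.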

Usage Examples

Text Response

import { ComputerUseAgent } from '@onkernel/cu-playwright';

// Navigate to the target page first
await page.goto("https://news.ycombinator.com/");

const agent = new ComputerUseAgent({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  page,
});

const result = await agent.execute(
  'Tell me the title of the top story on this page'
);
console.log(result); // "Title of the top story"

Structured Response with Zod

import { z } from 'zod';
import { ComputerUseAgent } from '@onkernel/cu-playwright';

const agent = new ComputerUseAgent({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  page,
});

const HackerNewsStory = z.object({
  title: z.string(),
  points: z.number(),
  author: z.string(),
  comments: z.number(),
  url: z.string().optional(),
});

const stories = await agent.execute(
  'Get the top 5 Hacker News stories with their details',
  z.array(HackerNewsStory).max(5)
);

console.log(stories);
// [
//   {
//     title: "Example Story",
//     points: 150,
//     author: "user123",
//     comments: 42,
//     url: "https://example.com"
//   },
//   ...
// ]

Advanced Options

const result = await agent.execute(
  'Complex task requiring more thinking',
  undefined, // No schema for text response
  {
    systemPromptSuffix: 'Be extra careful with form submissions.',
    thinkingBudget: 4096, // More thinking tokens for complex tasks
  }
);

Environment Setup

Anthropic API Key: Set your API key as an environment variable:
```
export ANTHROPIC_API_KEY=your_api_key_here
```
Playwright: Install Playwright and browser dependencies:
```
npx playwright install
```

Computer Use Parameters

This SDK leverages Anthropic's Computer Use API with the following key parameters:

Model Selection

Claude 3.5 Sonnet: Best balance of speed and capability for most tasks
Claude 4 Models: Enhanced reasoning with extended thinking capabilities
Claude 3.7 Sonnet: Advanced reasoning with thinking transparency

Thinking Budget

The thinkingBudget parameter controls Claude's internal reasoning process:

1024 tokens (default): Suitable for simple tasks
4096+ tokens: Better for complex reasoning tasks
16k+ tokens: Recommended for highly complex multi-step operations

See Anthropic's Extended Thinking guide for optimization tips.
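
One way to apply the tiers above is to map a rough task category to a budget before calling `execute()`. This is only a sketch: `TaskComplexity` and `pickThinkingBudget` are hypothetical names, and 16384 is an assumed concrete value for the "16k+" tier.

```typescript
// Hypothetical task categories; the SDK itself only accepts a number.
type TaskComplexity = "simple" | "complex" | "multi-step";

// Budget tiers taken from the guidance above. 16384 is an assumed value for "16k+".
const THINKING_BUDGETS: Record<TaskComplexity, number> = {
  simple: 1024,
  complex: 4096,
  "multi-step": 16384,
};

function pickThinkingBudget(complexity: TaskComplexity): number {
  return THINKING_BUDGETS[complexity];
}

// Build the options object passed as the third argument to execute().
const options = { thinkingBudget: pickThinkingBudget("complex") };
console.log(options);
```

The resulting object can be passed as the `options` argument, for example `agent.execute(query, undefined, options)`.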

Error Handling

The SDK includes built-in error handling. Wrap execute() calls in try/catch to handle failures in your own code:

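try {
  const result = await agent.execute('Your task here');
  console.log(result);
} catch (error) {
  // In strict TypeScript, `error` is `unknown`, so narrow it before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('No response received')) {
    console.log('Agent did not receive a response from Claude');
  } else {
    console.log('Other error:', message);
  }
}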
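
If the "No response received" case is transient in your setup, a small retry wrapper may help. The sketch below is not part of the SDK: `TextAgent` and `executeWithRetry` are hypothetical names, the interface is a minimal structural type that a `ComputerUseAgent` instance is assumed to satisfy, and retries happen immediately with no backoff.

```typescript
// Minimal structural type: any object with a text-returning execute() fits,
// including (by assumption) a ComputerUseAgent instance.
interface TextAgent {
  execute(query: string): Promise<string>;
}

// Hypothetical helper: retry only when the "No response received" error occurs.
async function executeWithRetry(
  agent: TextAgent,
  query: string,
  maxAttempts = 3
): Promise<string> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await agent.execute(query);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      // Rethrow anything that is not the "no response" case.
      if (!message.includes("No response received")) throw error;
      lastError = error;
    }
  }
  // All attempts failed with "No response received".
  throw lastError;
}
```

Each retry re-runs the whole task against the current page state, so this pattern suits read-only queries better than tasks that submit forms or otherwise change state.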

Best Practices

Use specific, clear instructions: "Click the red 'Submit' button" rather than "click submit"
Break complex tasks down: Use step-by-step instructions in your query
Optimize the thinking budget: Start with the default (1024) and increase it for complex tasks
Handle errors gracefully: Implement proper error handling for production use
Use structured responses: When you need a specific data format, use Zod schemas
Test with headless: false: During development, run with a visible browser to debug
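
As an alternative to putting every step in a single query, a complex task can be split into several `execute()` calls that run one after another. The sketch below assumes this approach works for your task; `StepAgent` and `runSteps` are hypothetical names, not part of the SDK.

```typescript
// Minimal structural type that a ComputerUseAgent instance is assumed to satisfy.
interface StepAgent {
  execute(query: string): Promise<string>;
}

// Hypothetical helper: run sub-tasks in order and collect each text response.
async function runSteps(agent: StepAgent, steps: string[]): Promise<string[]> {
  const results: string[] = [];
  for (const step of steps) {
    // Sequential on purpose: each step acts on the page state left by the previous one.
    results.push(await agent.execute(step));
  }
  return results;
}
```

For example, `runSteps(agent, ["Click the red 'Submit' button", "Tell me the confirmation message shown on the page"])` would issue two specific instructions in order. If any step throws, the loop stops and the error propagates to the caller.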

Security Considerations

⚠️ Important: Computer use can interact with any visible application. Always:

Run in isolated environments (containers/VMs) for production
Avoid providing access to sensitive accounts or data
Review Claude's actions in logs before production deployment
Use allowlisted domains when possible

See Anthropic's Computer Use Security Guide for detailed security recommendations.
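
The domain allowlist recommendation can be sketched as a small check on a URL's hostname. `isAllowedUrl` is a hypothetical helper, not an SDK feature, and the allowlist contents are illustrative.

```typescript
// Hypothetical helper: true only when the URL's host is an allowlisted domain
// or a subdomain of one.
function isAllowedUrl(url: string, allowedDomains: string[]): boolean {
  let hostname: string;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return false; // Unparseable URLs are rejected.
  }
  return allowedDomains.some(
    (domain) => hostname === domain || hostname.endsWith(`.${domain}`)
  );
}

const allowed = ["ycombinator.com"];
console.log(isAllowedUrl("https://news.ycombinator.com/", allowed)); // true
console.log(isAllowedUrl("https://evil-ycombinator.com/", allowed)); // false
```

One place to apply such a check is before each `page.goto(...)` call. Another option is a Playwright `page.route` handler that aborts requests to hosts outside the list. Matching on the leading dot avoids accepting lookalike hosts that merely end with the same characters.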
