@onkernel/cu-playwright
v0.1.2
Computer Use x Playwright SDK
Computer Use Playwright SDK
A TypeScript SDK that combines Anthropic's Computer Use capabilities with Playwright for browser automation tasks. This SDK provides a clean, type-safe interface for automating browser interactions using Claude's computer use abilities.
Features
- 🤖 Simple API: Single ComputerUseAgent class for all computer use tasks
- 🔄 Dual Response Types: Support for both text and structured (JSON) responses
- 🛡️ Type Safety: Full TypeScript support with Zod schema validation
- ⚡ Optimized: Clean error handling and robust JSON parsing
- 🎯 Focused: Clean API surface with sensible defaults
Installation
npm install @onkernel/cu-playwright
# or
yarn add @onkernel/cu-playwright
# or
bun add @onkernel/cu-playwright
Quick Start
import { chromium } from 'playwright';
import { ComputerUseAgent } from '@onkernel/cu-playwright';
const browser = await chromium.launch({ headless: false });
const page = await browser.newPage();
// Navigate to Hacker News manually first
await page.goto("https://news.ycombinator.com/");
const agent = new ComputerUseAgent({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  page,
});
// Simple text response
const answer = await agent.execute('Tell me the title of the top story');
console.log(answer);
await browser.close();
API Reference
ComputerUseAgent
The main class for computer use automation.
Constructor
new ComputerUseAgent(options: {
  apiKey: string;
  page: Page;
  model?: string;
})
Parameters:
- apiKey (string): Your Anthropic API key. Get one from the Anthropic Console
- page (Page): Playwright page instance to control
- model (string, optional): Anthropic model to use. Defaults to 'claude-sonnet-4-20250514'
Supported Models: See Anthropic's Computer Use documentation for the latest model compatibility.
execute() Method
async execute<T = string>(
  query: string,
  schema?: z.ZodSchema<T>,
  options?: {
    systemPromptSuffix?: string;
    thinkingBudget?: number;
  }
): Promise<T>
Parameters:
- query (string): The task description for Claude to execute
- schema (ZodSchema, optional): Zod schema for structured responses. When provided, the response will be validated against this schema
- options (object, optional):
  - systemPromptSuffix (string): Additional instructions appended to the system prompt
  - thinkingBudget (number): Token budget for Claude's internal reasoning process. Default: 1024. See the Extended Thinking documentation for details
Returns:
- Promise<T>: When schema is provided, returns validated data of type T
- Promise<string>: When no schema is provided, returns the text response
Usage Examples
Text Response
import { ComputerUseAgent } from '@onkernel/cu-playwright';
// Navigate to the target page first
await page.goto("https://news.ycombinator.com/");
const agent = new ComputerUseAgent({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  page,
});
const result = await agent.execute(
  'Tell me the title of the top story on this page'
);
console.log(result); // "Title of the top story"
Structured Response with Zod
import { z } from 'zod';
import { ComputerUseAgent } from '@onkernel/cu-playwright';
const agent = new ComputerUseAgent({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  page,
});
const HackerNewsStory = z.object({
  title: z.string(),
  points: z.number(),
  author: z.string(),
  comments: z.number(),
  url: z.string().optional(),
});
const stories = await agent.execute(
  'Get the top 5 Hacker News stories with their details',
  z.array(HackerNewsStory).max(5)
);
console.log(stories);
// [
//   {
//     title: "Example Story",
//     points: 150,
//     author: "user123",
//     comments: 42,
//     url: "https://example.com"
//   },
//   ...
// ]
Advanced Options
const result = await agent.execute(
  'Complex task requiring more thinking',
  undefined, // No schema for text response
  {
    systemPromptSuffix: 'Be extra careful with form submissions.',
    thinkingBudget: 4096, // More thinking tokens for complex tasks
  }
);
Environment Setup
Anthropic API Key: Set your API key as an environment variable:
export ANTHROPIC_API_KEY=your_api_key_here
Playwright: Install Playwright and browser dependencies:
npx playwright install
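The Quick Start passes process.env.ANTHROPIC_API_KEY! with a non-null assertion, which hides a missing variable until the first API call. A minimal sketch of an explicit check follows; requireEnv is a hypothetical helper written for illustration and is not part of this SDK.

```typescript
// Hypothetical helper (not part of @onkernel/cu-playwright): read a required
// environment variable or fail early with a descriptive error.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name];
  // Treat unset and blank values the same way
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage in Node.js (assumes `page` is an existing Playwright page):
//   const apiKey = requireEnv('ANTHROPIC_API_KEY', process.env);
//   const agent = new ComputerUseAgent({ apiKey, page });
```

Passing the environment object as an argument keeps the helper easy to test without touching the real process environment.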
Computer Use Parameters
This SDK leverages Anthropic's Computer Use API with the following key parameters:
Model Selection
- Claude 3.5 Sonnet: Best balance of speed and capability for most tasks
- Claude 4 Models: Enhanced reasoning with extended thinking capabilities
- Claude 3.7 Sonnet: Advanced reasoning with thinking transparency
Thinking Budget
The thinkingBudget parameter controls Claude's internal reasoning process:
- 1024 tokens (default): Suitable for simple tasks
- 4096+ tokens: Better for complex reasoning tasks
- 16k+ tokens: Recommended for highly complex multi-step operations
See Anthropic's Extended Thinking guide for optimization tips.
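One way to apply the tiers above is to name them once and look the budget up per task. This is only a sketch: the tier names and the thinkingBudgetFor helper are hypothetical, and 16384 is an assumed reading of "16k+"; only the 1024 and 4096 values are stated directly in this README.

```typescript
// Hypothetical tier names; the SDK itself only accepts a number.
type TaskComplexity = 'simple' | 'complex' | 'multiStep';

// Token values follow the guidance in this section.
// 16384 is an assumption for "16k+"; adjust to your own measurements.
const THINKING_BUDGETS: Record<TaskComplexity, number> = {
  simple: 1024,
  complex: 4096,
  multiStep: 16384,
};

function thinkingBudgetFor(complexity: TaskComplexity): number {
  return THINKING_BUDGETS[complexity];
}

// Usage (assumes `agent` is a ComputerUseAgent and `query` is a string):
//   await agent.execute(query, undefined, {
//     thinkingBudget: thinkingBudgetFor('complex'),
//   });
```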
Error Handling
The SDK includes built-in error handling:
try {
  const result = await agent.execute('Your task here');
  console.log(result);
} catch (error) {
  // In strict TypeScript the caught value is `unknown`, so narrow it first
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('No response received')) {
    console.log('Agent did not receive a response from Claude');
  } else {
    console.log('Other error:', message);
  }
}
Best Practices
- Use specific, clear instructions: "Click the red 'Submit' button" vs "click submit"
- For complex tasks, break them down: Use step-by-step instructions in your query
- Optimize thinking budget: Start with the default (1024) and increase for complex tasks
- Handle errors gracefully: Implement proper error handling for production use
- Use structured responses: When you need a specific data format, use Zod schemas
- Test with headless: false: During development, run with a visible browser to debug
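The advice to break complex tasks down can be sketched as a small query builder that turns a goal and a list of steps into one numbered instruction string. buildStepwiseQuery is a hypothetical helper, and the exact wording it produces is an assumption rather than a format the SDK requires.

```typescript
// Hypothetical helper: compose a goal and ordered steps into a single query.
function buildStepwiseQuery(goal: string, steps: string[]): string {
  // With no steps, fall back to the plain goal
  if (steps.length === 0) return goal;
  const numbered = steps.map((step, i) => `${i + 1}. ${step}`);
  return [goal, 'Follow these steps in order:', ...numbered].join('\n');
}

// Usage (assumes `agent` is a ComputerUseAgent):
//   const query = buildStepwiseQuery('Find the top story on Hacker News', [
//     'Locate the first item in the story list',
//     'Read its title and point count',
//   ]);
//   const answer = await agent.execute(query);
```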
Security Considerations
⚠️ Important: Computer use can interact with any visible application. Always:
- Run in isolated environments (containers/VMs) for production
- Avoid providing access to sensitive accounts or data
- Review Claude's actions in logs before production deployment
- Use allowlisted domains when possible
See Anthropic's Computer Use Security Guide for detailed security recommendations.
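Domain allowlisting can be approximated with a host check applied to every outgoing request. The sketch below assumes an exact-or-subdomain matching rule; isAllowedUrl is a hypothetical helper, not an SDK feature, and it is no substitute for running in an isolated environment.

```typescript
// Hypothetical helper: true when the URL's host equals an allowlisted host
// or is a subdomain of one.
function isAllowedUrl(rawUrl: string, allowedHosts: string[]): boolean {
  let host: string;
  try {
    host = new URL(rawUrl).hostname;
  } catch {
    return false; // not a parseable absolute URL
  }
  return allowedHosts.some(
    // The leading dot prevents "evilexample.com" from matching "example.com"
    (allowed) => host === allowed || host.endsWith(`.${allowed}`),
  );
}

// Possible wiring with Playwright request interception
// (assumes `page` is an existing Playwright page):
//   await page.route('**/*', (route) =>
//     isAllowedUrl(route.request().url(), ['ycombinator.com'])
//       ? route.continue()
//       : route.abort(),
//   );
```

Aborting at the request level also blocks third-party assets the page loads, so the allowlist may need to include CDN hosts the target site depends on.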
Requirements
- Node.js 18+
- TypeScript 5+
- Playwright 1.52+
- Anthropic API key
Related Resources
- Anthropic Computer Use Documentation
- Extended Thinking Guide
- Playwright Documentation
- Zod Documentation
License
See License
