stagehand-runner

v1.0.10 · Published 4 days ago · kevinke

An AI browser automation task runner built on Stagehand + Playwright, with video recording, SOCKS5 proxy support, and real-time step callbacks.

Keywords: stagehand playwright browser-automation ai-agent gemini web-scraping video-recording socks5-proxy cdp headless

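🤖 AI Agent driven: control the browser with natural-language instructions, with no hand-written selectors
📹 Automatic video recording: every task is recorded from start to finish and saved as a .webm file
🔒 SOCKS5 proxy support: route around network restrictions on LLM API access
📡 Real-time step callbacks: receive each Agent action as it happens through onStep
🔌 CDP attachment: Stagehand attaches to the Chrome instance that Playwright has already launched, so both share the same browser instance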

Installation

npm install stagehand-runner

The peer dependencies must be installed as well:

npm install @browserbasehq/stagehand playwright
npx playwright install chromium

Quick start

import { runStagehandTask } from 'stagehand-runner';

const result = await runStagehandTask({
  url: 'https://www.example.com',
  apiKey: process.env.GOOGLE_API_KEY,  // Google Gemini API Key
  instruction: 'Find the title and publication time of every article on the page and return them as a JSON array',
  loginWaitTime: 0,  // skip the manual-login pause
  onStep: (step) => {
    console.log('[Step]', step);
  },
});

console.log('Result:', result.result);
console.log('Video:', result.videoPath);

API reference

`runStagehandTask(options)`

Runs a single AI browser automation task.

Parameters

| Parameter | Type | Required | Default | Description |
|------|------|------|--------|------|
| url | string | ✅ | — | Target page URL |
| apiKey | string | ✅ | — | LLM API key (for example, a Google Gemini API key) |
| instruction | string | ✅ | — | Natural-language instruction for the Agent to carry out |
| loginWaitTime | number | — | 10000 | Milliseconds to pause after the page loads (for manual QR-code scanning or login) |
| modelName | string | — | "google/gemini-2.0-flash" | LLM model name |
| systemPrompt | string | — | Chinese-language assistant prompt | Agent system prompt |
| proxy | string | — | — | SOCKS5 proxy address, format: socks5://user:pass@host:port |
| outputDir | string | — | process.cwd()/recordings | Output directory for recorded videos |
| onStep | (step: StepData) => void | — | — | Real-time step callback |
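To make the defaults in the table concrete, the sketch below shows one way omitted options could be filled in. The `RunOptions` interface and the `applyDefaults` helper are hypothetical and only mirror the table; the package's internals may differ, and `systemPrompt` is left out because its default text is not documented here.

```typescript
// Hypothetical mirror of the documented options; not exported by stagehand-runner.
interface RunOptions {
  url: string;
  apiKey: string;
  instruction: string;
  loginWaitTime?: number;
  modelName?: string;
  proxy?: string;
  outputDir?: string;
}

// Fill in the documented defaults for any option the caller omitted.
// `cwd` is passed in explicitly so the helper stays a pure function.
function applyDefaults(options: RunOptions, cwd: string) {
  return {
    ...options,
    // `??` keeps an explicit 0, which `||` would silently replace with 10000.
    loginWaitTime: options.loginWaitTime ?? 10000,
    modelName: options.modelName ?? "google/gemini-2.0-flash",
    outputDir: options.outputDir ?? `${cwd}/recordings`,
  };
}

const resolvedOptions = applyDefaults(
  {
    url: "https://www.example.com",
    apiKey: "demo-key",
    instruction: "Return the page title",
    loginWaitTime: 0,
  },
  "/work",
);
console.log(resolvedOptions.loginWaitTime, resolvedOptions.modelName, resolvedOptions.outputDir);
```

The nullish-coalescing operator matters for `loginWaitTime`: passing 0 is a meaningful value (no pause), so it should not fall back to the default.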

Return value

{
  success: boolean;      // whether the task succeeded
  result: unknown;       // final data returned by the Agent
  videoPath: string | null; // absolute path to the video
  sessionId: string;     // session ID (e.g. "session-1718000000000")
}
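Because `result` is typed as `unknown` and `videoPath` may be `null`, callers will usually want to check both before using them. The following is a minimal sketch under that assumption; `TaskResult` is a local mirror of the shape above and `describeResult` is a hypothetical helper, neither of which is confirmed to be exported by the package.

```typescript
// Local mirror of the documented return shape (an assumption; the package may export its own type).
interface TaskResult {
  success: boolean;
  result: unknown;
  videoPath: string | null;
  sessionId: string;
}

// Build a one-line summary, guarding against a failed task and a missing video.
function describeResult(r: TaskResult): string {
  const outcome = r.success ? `ok: ${JSON.stringify(r.result)}` : "failed";
  const video = r.videoPath ?? "no video recorded";
  return `[${r.sessionId}] ${outcome} (${video})`;
}

const sampleResult: TaskResult = {
  success: true,
  result: [{ title: "Hello" }],
  videoPath: null,
  sessionId: "session-1718000000000",
};
console.log(describeResult(sampleResult));
```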

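`setupSocks5Proxy(proxyUrl)`

Manually initializes the global SOCKS5 proxy. The call is idempotent: only the first call takes effect.

Use it when several tasks should share the same proxy configuration. For a single task, passing the proxy option is enough.

import { setupSocks5Proxy } from 'stagehand-runner';

setupSocks5Proxy('socks5://user:pass@host:1080');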
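Since the proxy address is a plain string, a malformed value is easy to pass by mistake. The sketch below validates the documented `socks5://user:pass@host:port` format with the standard `URL` class before the string is handed to the library. `parseSocks5Url` is a hypothetical helper, not part of stagehand-runner, and the address `127.0.0.1:1080` is only a sample value.

```typescript
// Hypothetical pre-flight check; stagehand-runner does not export this helper.
function parseSocks5Url(proxyUrl: string): { host: string; port: number; hasAuth: boolean } {
  const parsed = new URL(proxyUrl); // throws a TypeError on malformed input
  if (parsed.protocol !== "socks5:") {
    throw new Error(`Expected a socks5:// URL, got ${parsed.protocol}//`);
  }
  const port = Number(parsed.port); // an omitted port parses as "" and becomes 0
  if (!parsed.hostname || !Number.isInteger(port) || port <= 0) {
    throw new Error("A SOCKS5 proxy URL needs an explicit host and port");
  }
  return { host: parsed.hostname, port, hasAuth: parsed.username !== "" };
}

const proxyInfo = parseSocks5Url("socks5://user:pass@127.0.0.1:1080");
console.log(proxyInfo.host, proxyInfo.port, proxyInfo.hasAuth);
```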

Advanced usage

Sites that require login

const result = await runStagehandTask({
  url: 'https://app.example.com/login',
  apiKey: process.env.GOOGLE_API_KEY,
  instruction: 'Open the "My Orders" page and extract the order number, amount, and status of the 10 most recent orders',
  loginWaitTime: 30000, // wait 30 seconds so the user can log in manually
  onStep: ({ type, ...data }) => {
    if (type === 'status') console.log('📌', data.message);
  },
});

Calling the LLM API through a SOCKS5 proxy

const result = await runStagehandTask({
  url: 'https://example.com',
  apiKey: process.env.GOOGLE_API_KEY,
  instruction: 'Extract the product list',
  proxy: 'socks5://user:pass@host:1080',
});

Custom video output directory

import path from 'path';

const result = await runStagehandTask({
  url: 'https://example.com',
  apiKey: process.env.GOOGLE_API_KEY,
  instruction: 'Take a screenshot and return the page title',
  outputDir: path.resolve('./my-recordings'),
});

console.log(result.videoPath);
// → /absolute/path/to/my-recordings/session-1718000000000/recording.webm
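The example path suggests a layout of `<outputDir>/<sessionId>/recording.webm`. Under that assumption, the sketch below recovers the session ID from a video path and, assuming further that the numeric suffix is a millisecond Unix timestamp (which the sample ID resembles but the documentation does not state), converts it to a `Date`. Both helpers are hypothetical.

```typescript
// Assumes the layout shown in the example path: <outputDir>/<sessionId>/recording.webm.
function sessionIdFromVideoPath(videoPath: string): string | null {
  const match = /(session-\d+)[\\/]recording\.webm$/.exec(videoPath);
  return match?.[1] ?? null;
}

// Assumes the numeric suffix is a millisecond Unix timestamp (not confirmed by the docs).
function sessionStartTime(sessionId: string): Date | null {
  const digits = /^session-(\d+)$/.exec(sessionId)?.[1];
  return digits ? new Date(Number(digits)) : null;
}

const recordedId = sessionIdFromVideoPath(
  "/absolute/path/to/my-recordings/session-1718000000000/recording.webm",
);
const startedAt = recordedId ? sessionStartTime(recordedId) : null;
console.log(recordedId, startedAt ? startedAt.toISOString() : null);
```

This can be handy for sorting or pruning old recordings without keeping a separate index of session IDs.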

Listening to Agent steps

The onStep callback receives three kinds of data:

onStep: (step) => {
  if (step.type === 'status') {
    // Task status message (started, waiting, finished, and so on)
    console.log(step.message);
  } else if (step.type === 'reasoning') {
    // The Agent's standalone reasoning
    console.log('Thinking:', step.message);
  } else {
    // Agent tool call (act / extract / observe, etc.)
    console.log(`Tool: ${step.tool}`);
    console.log(`Description: ${step.describe}`);
    console.log(`Reasoning: ${step.reasoning}`);
  }
}
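The three cases above can also be folded into a reusable formatter that collects steps for later logging. This is a sketch only: `StepLike` is a hypothetical local shape inferred from the fields used above (the real `StepData` type may differ), and the `"tool"` type value in the sample call is an assumption, since the documentation only says tool calls are the remaining case.

```typescript
// Hypothetical local shape inferred from the three cases above; the real StepData type may differ.
interface StepLike {
  type: string;
  message?: string;
  tool?: string;
  describe?: string;
  reasoning?: string;
}

// Turn one step into a single log line.
function formatStep(step: StepLike): string {
  if (step.type === "status") return `[status] ${step.message ?? ""}`;
  if (step.type === "reasoning") return `[reasoning] ${step.message ?? ""}`;
  // Anything else is treated as a tool call, matching the final else branch above.
  return `[tool:${step.tool ?? "unknown"}] ${step.describe ?? ""}`;
}

// Collect formatted lines so they can be written to a log after the task ends.
const stepLog: string[] = [];
const collectStep = (step: StepLike): void => {
  stepLog.push(formatStep(step));
};

collectStep({ type: "status", message: "Task started" });
collectStep({ type: "tool", tool: "extract", describe: "Read article titles" });
console.log(stepLog.join("\n"));
```

A function like `collectStep` could be passed as the `onStep` option, keeping the task call itself free of logging details.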

Requirements

Node.js >= 18
@browserbasehq/stagehand >= 3.0.0
playwright >= 1.40.0

License

MIT
