@skrillex1224/playwright-toolkit
v2.1.12
A utility library that enables a live screenshot view in Apify/Crawlee Actors.
Playwright Toolkit
A utility library for Apify/Crawlee Actor developers, providing anti-detection, human-like interaction, live screenshots, and related features.
📦 Installation
npm install @skrillex1224/playwright-toolkit
# Dependencies required for anti-detection
npm install playwright-extra puppeteer-extra-plugin-stealth ghost-cursor-playwright delay
🚀 Quick Start
import { Actor } from 'apify';
import { PlaywrightCrawler } from 'crawlee';
import { chromium } from 'playwright-extra';
import stealthPlugin from 'puppeteer-extra-plugin-stealth';
import { usePlaywrightToolKit } from '@skrillex1224/playwright-toolkit';
await Actor.init();
// Initialize the toolkit
const { ApifyKit: KitHook, Launch, Stealth, Humanize, Captcha, LiveView, Constants } = usePlaywrightToolKit();
// ⚠️ ApifyKit must be initialized asynchronously
const ApifyKit = await KitHook.useApifyKit();
// Create a stealth browser
const stealthChromium = Launch.createStealthChromium(chromium, stealthPlugin);
// LiveView
const { startLiveViewServer, takeLiveScreenshot } = LiveView.useLiveView();
const crawler = new PlaywrightCrawler({
  launchContext: {
    launcher: stealthChromium,
    launchOptions: Launch.getAdvancedLaunchOptions(),
  },
  preNavigationHooks: [
    async ({ page }) => {
      // Sync the viewport with the screen (guards against fingerprint detection)
      await Stealth.syncViewportWithScreen(page);
      // CAPTCHA monitoring
      Captcha.useCaptchaMonitor(page, {
        domSelector: '#captcha_container',
        onDetected: async () => { /* handle the CAPTCHA */ }
      });
    }
  ],
  requestHandler: async ({ page }) => {
    // Initialize the cursor
    await Humanize.initializeCursor(page);
    // Warm up the page (simulates human browsing)
    await Humanize.warmUpBrowsing(page, 3000);
    // Run a step (on failure, takes a screenshot and calls Actor.fail)
    await ApifyKit.runStep('Enter search query', page, async () => {
      await Humanize.humanType(page, 'input', 'search text');
      await Humanize.humanClick(page, '#submit-btn');
    });
    // Push success data
    await ApifyKit.pushSuccess({ result: 'data' });
  }
});
await startLiveViewServer();
await crawler.run(['https://example.com']);
await Actor.exit();
🛡️ Anti-Detection Features
Architecture
| Layer | Problem | Solution |
|------|------|----------|
| Fingerprint layer | navigator.webdriver, plugins, webgl | puppeteer-extra-plugin-stealth |
| Behavior layer | Mechanical typing/clicking/scrolling | ghost-cursor-playwright + Humanize |
| Page layer | CAPTCHA / risk-control detection | Captcha monitor |
API Overview
| Module | Method | Description |
|------|------|------|
| Launch | createStealthChromium(chromium, stealthPlugin) | Registers the Stealth plugin |
| Launch | getAdvancedLaunchOptions() | Enhanced launch options |
| Launch | getLaunchOptions() | Basic launch options |
| Launch | getFingerprintGeneratorOptions() | Fingerprint generator options |
| Stealth | syncViewportWithScreen(page) | Syncs the viewport with the screen |
| Stealth | hideWebdriver(page) | Hides webdriver |
| Stealth | setupBlockingResources(page, types?) | Resource blocking |
| Stealth | setChinaTimezone(context) | Sets the China time zone (UTC+8) |
| Humanize | initializeCursor(page) | Initializes the cursor (must be called first) |
| Humanize | jitterMs(base, jitterPercent?) | Generates a jittered millisecond value (synchronous, returns a number) |
| Humanize | humanType(page, selector, text, options?) | Human-like typing (baseDelay=180ms ±40%) |
| Humanize | humanClick(page, selector, options?) | Human-like clicking (reactionDelay=250ms ±40%) |
| Humanize | warmUpBrowsing(page, baseDuration?) | Page warm-up (3500ms ±40%) |
| Humanize | naturalScroll(page, direction?, distance?, steps?) | Natural scrolling (with inertia + jitter) |
| Humanize | simulateGaze(page, baseDurationMs?) | Simulated gaze (2500ms ±40%) |
| Humanize | randomSleep(baseMs, jitterPercent?) | Random delay (±30% jitter) |
| Captcha | useCaptchaMonitor(page, options) | CAPTCHA monitoring |
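
The ± percentages in the table describe jitter around a base value. As a rough illustration, a helper of this kind could be written as shown here. This is a minimal sketch, not the toolkit's implementation: the name `jitterMsSketch`, the uniform distribution, the rounding, the injectable `random` parameter, and the 0.4 default are all assumptions chosen to match the ±40% figures.

```javascript
// Hypothetical sketch of a jitter helper; the toolkit's own jitterMs may differ.
// Returns an integer drawn uniformly from [base * (1 - p), base * (1 + p)].
function jitterMsSketch(base, jitterPercent = 0.4, random = Math.random) {
  const min = base * (1 - jitterPercent);
  const max = base * (1 + jitterPercent);
  return Math.round(min + random() * (max - min));
}

// Example: a 180 ms base delay with ±40% jitter stays within 108..252 ms.
const delay = jitterMsSketch(180);
console.log(delay >= 108 && delay <= 252); // prints true
```

Passing a fixed `random` function makes the helper deterministic, which can be convenient when testing timing logic.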
📦 Module Details
ApifyKit
⚠️ Requires asynchronous initialization
const { ApifyKit: KitHook } = usePlaywrightToolKit();
const ApifyKit = await KitHook.useApifyKit();
// Run a step (on failure: takes a screenshot, pushes to the Dataset, and calls Actor.fail)
await ApifyKit.runStep('Step name', page, async () => {
  // Your logic
});
// Loose variant (on failure, only throws; does not call Actor.fail)
await ApifyKit.runStepLoose('Step name', page, async () => {
  // Your logic
});
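
The difference between the two variants can be sketched roughly as follows. This is an illustration under stated assumptions, not the toolkit's code: `runStepSketch`, the `strict` flag, and the `onFailure` callback are hypothetical names, and the callback merely stands in for the screenshot, Dataset push, and Actor.fail call described in the comments above. Rethrowing in both variants is a choice made for this sketch.

```javascript
// Hypothetical sketch of strict vs. loose step handling; not the toolkit's code.
// onFailure stands in for "screenshot + push to Dataset + Actor.fail".
async function runStepSketch(name, fn, { strict = true, onFailure = async () => {} } = {}) {
  try {
    return await fn();
  } catch (error) {
    if (strict) {
      // Strict variant: report the failure before giving up.
      await onFailure(name, error);
    }
    // In this sketch both variants surface the error to the caller.
    throw error;
  }
}

// Usage: only the strict call reaches the failure hook.
const reported = [];
const failing = async () => { throw new Error('boom'); };
const hook = async (step, error) => { reported.push(`${step}: ${error.message}`); };

runStepSketch('strict step', failing, { strict: true, onFailure: hook })
  .catch(() => runStepSketch('loose step', failing, { strict: false, onFailure: hook }))
  .catch(() => console.log(reported));
```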
// Push success data (the payload is wrapped in the data field)
await ApifyKit.pushSuccess({ key: 'value' });
// Output: { code: 0, status: 'SUCCESS', timestamp: '...', data: { key: 'value' } }
CrawlerError
A custom error class that can carry a code and a context, both of which pushFailed parses automatically:
const { Errors, Constants } = usePlaywrightToolKit();
const { CrawlerError } = Errors;
const { ErrorKeygen } = Constants;
// Simple usage (message only)
throw new CrawlerError('Feed API response was not captured');
// Full usage (with code and context)
throw new CrawlerError({
  message: 'Login failed',
  code: ErrorKeygen.NotLogin, // Used as the code field in pushFailed
  context: { url: currentUrl, userId: '123' }
});
// Convert from a plain Error
throw CrawlerError.from(originalError, {
  code: ErrorKeygen.Chaptcha,
  context: { step: 'CAPTCHA detection' }
});
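
For readers who want to see how an error class with this shape might be put together, here is a minimal sketch. It is not the toolkit's source: `SketchCrawlerError` is a hypothetical name, and the defaults (an empty context, the original error stored on `cause`) are assumptions.

```javascript
// Hypothetical sketch of an error class carrying a code and a context;
// the toolkit's CrawlerError may be implemented differently.
class SketchCrawlerError extends Error {
  constructor(input) {
    // Accept either a plain message string or { message, code, context }.
    const options = typeof input === 'string' ? { message: input } : input;
    super(options.message);
    this.name = 'SketchCrawlerError';
    this.code = options.code;
    this.context = options.context ?? {};
  }

  // Wrap an ordinary Error, keeping its message and attaching code/context.
  static from(error, { code, context } = {}) {
    const wrapped = new SketchCrawlerError({ message: error.message, code, context });
    wrapped.cause = error; // assumption: keep the original error for debugging
    return wrapped;
  }
}

const wrappedError = SketchCrawlerError.from(new Error('timeout'), {
  code: 30000002,
  context: { step: 'captcha check' },
});
console.log(wrappedError.code, wrappedError.message, wrappedError instanceof Error);
// prints: 30000002 timeout true
```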
// pushFailed output:
// { code: 30000001, status: 'FAILED', error: {...}, context: {...}, meta: {...}, ... }
LiveView
const { LiveView } = usePlaywrightToolKit();
const { startLiveViewServer, takeLiveScreenshot } = LiveView.useLiveView();
await startLiveViewServer();
await takeLiveScreenshot(page, 'Current state');
Captcha
const { Captcha } = usePlaywrightToolKit();
// DOM monitoring mode
Captcha.useCaptchaMonitor(page, {
  domSelector: '#captcha_container',
  onDetected: async () => { await Actor.fail('CAPTCHA detected'); }
});
// URL monitoring mode
Captcha.useCaptchaMonitor(page, {
  urlPattern: '/captcha',
  onDetected: async () => { await Actor.fail('CAPTCHA detected'); }
});
Constants
const { Constants } = usePlaywrightToolKit();
const { ErrorKeygen, Status, StatusCode } = Constants;
// ErrorKeygen: { NotLogin: 30000001, Chaptcha: 30000002 }
// Status: { Success: 'SUCCESS', Failed: 'FAILED' }
// StatusCode: { Success: 0, Failed: -1 }
Utils
const { Utils } = usePlaywrightToolKit();
// Parse SSE stream text
const events = Utils.parseSseStream(sseText);
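
To make the idea of parsing SSE text concrete, the sketch below splits a stream into events following the standard `field: value` line format. It is only an illustration: `parseSseTextSketch` is a hypothetical name, and the returned shape (`event`, `data`, optional `id`) is an assumption, since the return value of `Utils.parseSseStream` is not described here.

```javascript
// Hypothetical sketch of parsing SSE text into events; the return shape of
// the toolkit's Utils.parseSseStream is not documented here and may differ.
function parseSseTextSketch(text) {
  const events = [];
  // Events are separated by blank lines.
  for (const block of text.split(/\r?\n\r?\n/)) {
    const event = { event: 'message', data: [] };
    for (const line of block.split(/\r?\n/)) {
      if (line === '' || line.startsWith(':')) continue; // skip blanks and comments
      const colon = line.indexOf(':');
      const field = colon === -1 ? line : line.slice(0, colon);
      // In the SSE format, a single leading space after the colon is dropped.
      const value = colon === -1 ? '' : line.slice(colon + 1).replace(/^ /, '');
      if (field === 'data') event.data.push(value);
      else if (field === 'event') event.event = value;
      else if (field === 'id') event.id = value;
    }
    // Only keep blocks that carry at least one data line.
    if (event.data.length > 0) {
      events.push({ ...event, data: event.data.join('\n') });
    }
  }
  return events;
}

const sampleSse = 'event: delta\ndata: {"text":"Hi"}\n\ndata: line 1\ndata: line 2\n\n';
console.log(parseSseTextSketch(sampleSse));
```

Multiple `data:` lines within one event are joined with newlines, which is how the SSE format defines multi-line payloads.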
// Parse a cookie string
const cookies = Utils.parseCookies('key=value; key2=value2', '.example.com');
await page.context().addCookies(cookies);
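
A cookie-string parser of this kind can be sketched in a few lines. The version below is an assumption-laden illustration rather than the toolkit's implementation: `parseCookiesSketch` is a hypothetical name, and defaulting `path` to `'/'` is a choice made here so that each object carries the `domain` and `path` pair that Playwright's `addCookies` expects when no `url` is given.

```javascript
// Hypothetical sketch of turning a Cookie header string into objects that
// Playwright's context.addCookies() accepts; the toolkit's parser may differ.
function parseCookiesSketch(cookieString, domain) {
  return cookieString
    .split(';')
    .map((pair) => pair.trim())
    .filter((pair) => pair.includes('='))
    .map((pair) => {
      // Split on the first '=' only, since cookie values may contain '='.
      const index = pair.indexOf('=');
      return {
        name: pair.slice(0, index).trim(),
        value: pair.slice(index + 1).trim(),
        domain,
        path: '/', // assumption: cookies apply to the whole site
      };
    });
}

console.log(parseCookiesSketch('key=value; token=a=b', '.example.com'));
```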
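// Full-page scrolling screenshot (automatically detects all scrollable elements and force-expands them before capturing)
const base64Image = await Utils.fullPageScreenshot(page);
// Returns a base64-encoded PNG image
📄 License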
ISC
