ai-bypass-captcha
v1.0.2
AI-powered hCaptcha solver with provider fallback chain
AI-powered CAPTCHA handling for Playwright E2E tests and browser automation.
Use Cases
- End-to-end testing of forms protected by CAPTCHA
- QA automation pipelines with Playwright
- Accessibility testing of CAPTCHA-protected workflows
- Browser automation research and development
Features
- Handles hCaptcha challenges using AI vision models
- Supports multiple AI providers with fallback chain
- Returns token usage for cost tracking
- Realistic browser interaction patterns
- Comprehensive browser environment configuration
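The provider fallback chain listed above can be pictured as trying each provider in order and moving on when one fails. The following is a minimal, generic sketch of that pattern, not the package's internal implementation: `runWithFallback`, `Attempt`, and `ChainResult` are hypothetical names introduced for illustration only.

```typescript
type ProviderName = 'gemini' | 'openai'

// Hypothetical per-provider call; stands in for whatever work a provider performs.
type Attempt<T> = (provider: ProviderName) => Promise<T>

interface ChainResult<T> {
  provider: ProviderName // provider that produced the value
  value: T
  providersTried: number // how many providers were called, including the winner
}

// Try each provider in the given order; return the first success.
// If every provider throws, reject with a summary of all failures.
async function runWithFallback<T>(
  providers: ProviderName[],
  attempt: Attempt<T>,
): Promise<ChainResult<T>> {
  const errors: string[] = []
  let providersTried = 0
  for (const provider of providers) {
    providersTried += 1
    try {
      const value = await attempt(provider)
      return { provider, value, providersTried }
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err)
      errors.push(`${provider}: ${message}`)
    }
  }
  throw new Error(`All providers failed (${errors.join('; ')})`)
}
```

Under this sketch, the order of the `providers` array is the only thing that decides priority, which matches how the `providers` option is described in this README.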
Installation
npm install ai-bypass-captcha playwright
npx playwright install chromium
Note: Playwright is a required peer dependency.
Usage
import { chromium } from 'playwright'
import {
  solveCaptcha,
  applyStealthMode,
  getStealthContextOptions,
  getStealthLaunchArgs,
} from 'ai-bypass-captcha'
// Launch browser with optimized args
const browser = await chromium.launch({
  args: getStealthLaunchArgs(),
})
// Create context with realistic options
const context = await browser.newContext(getStealthContextOptions())
// Configure browser environment
await applyStealthMode(context)
const page = await context.newPage()
await page.goto('https://site-with-hcaptcha.com')
const result = await solveCaptcha({
  page,
  providers: ['gemini'],
})
if (result.success) {
  console.log('Token:', result.data.token)
  console.log('Attempts:', result.data.attempts)
  console.log('Tokens used:', result.data.tokensUsed.total)
} else {
  console.error('Failed:', result.error)
}
await browser.close()
Configuration
Set API keys via environment variables:
GEMINI_API_KEY=your_gemini_key
OPENAI_API_KEY=your_openai_key
Or pass them programmatically via providerConfig:
const result = await solveCaptcha({
  page,
  providers: ['gemini'],
  providerConfig: {
    gemini: { apiKey: 'your_key', model: 'gemini-2.5-flash' },
  },
})
API
solveCaptcha(input)
Handles a CAPTCHA challenge on the given page.
Input:
| Field | Type | Required | Description |
| -------------- | -------------------------------------------------- | -------- | -------------------------------------- |
| page | Page | Yes | Playwright Page instance |
| providers | ProviderName[] | No | Provider order (default: ['gemini']) |
| providerConfig | Partial<Record<ProviderName, ProviderOptions>> | No | Per-provider API key/model overrides |
| timeout | number | No | Timeout in ms (default: 120000) |
ProviderName = 'gemini' | 'openai'
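The `providerConfig` field in the table above can be assembled from the environment variables named under Configuration. A minimal sketch follows; it assumes `ProviderOptions` has optional `apiKey` and `model` fields (only those two appear in this README), and `buildProviderConfig` and `ENV_KEYS` are hypothetical helpers, not exports of the package.

```typescript
// Local mirrors of the types named in this README; the package may export its own.
type ProviderName = 'gemini' | 'openai'
interface ProviderOptions {
  apiKey?: string // assumed optional
  model?: string // assumed optional
}

// Environment variable documented for each provider.
const ENV_KEYS: Record<ProviderName, string> = {
  gemini: 'GEMINI_API_KEY',
  openai: 'OPENAI_API_KEY',
}

// Build a providerConfig object containing only providers whose key is set.
function buildProviderConfig(
  env: Record<string, string | undefined>,
): Partial<Record<ProviderName, ProviderOptions>> {
  const config: Partial<Record<ProviderName, ProviderOptions>> = {}
  for (const name of Object.keys(ENV_KEYS) as ProviderName[]) {
    const apiKey = env[ENV_KEYS[name]]
    if (apiKey) {
      config[name] = { apiKey }
    }
  }
  return config
}
```

In a Node.js script this would typically be called as `buildProviderConfig(process.env)`, with the result passed as `providerConfig`. Taking the environment as a parameter keeps the helper easy to test.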
Output:
{
  success: boolean
  error: string | null
  data: {
    token: string
    provider: ProviderName
    attempts: number
    tokensUsed: { input: number; output: number; total: number }
  } | null
  totalTimeMs: number
}
Browser Environment
| Function | Description |
| ---------------------------- | --------------------------------------------------------------------- |
| getStealthLaunchArgs() | Returns optimized browser launch args for realistic E2E testing |
| getStealthContextOptions() | Returns context options with realistic viewport and user-agent |
| applyStealthMode(context) | Configures browser environment with realistic fingerprints for testing |
Providers
| Provider | Status      |
| -------- | ----------- |
| Gemini   | Implemented |
| OpenAI   | Implemented |
Users provide their own API keys. See Configuration.
Performance
Results from 10 runs using Gemini (default model: gemini-2.5-flash):
| Metric          | Value                    |
| --------------- | ------------------------ |
| Success rate    | 100% (10/10)             |
| Avg time        | 22.2s                    |
| Avg attempts    | 1.9                      |
| Avg tokens      | 2493 (in: 1707, out: 66) |
| Est. cost/solve | $0.000296                |
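The per-solve cost can be estimated from the `tokensUsed` object that `solveCaptcha` returns. The sketch below is one way to do that arithmetic. The rates of $0.15 per million input tokens and $0.60 per million output tokens are an assumption: this README does not state them, though they do reproduce the estimate in the table. Check your provider's current pricing before relying on them. The sketch bills only the `input` and `output` counts, since this README does not specify how `total` relates to them. `estimateCostUsd` and `ASSUMED_RATES` are hypothetical names.

```typescript
// Mirrors the tokensUsed shape from the Output section.
interface TokenUsage {
  input: number
  output: number
  total: number
}

// ASSUMED rates in USD per million tokens; not stated in this README.
const ASSUMED_RATES = { inputPerMillion: 0.15, outputPerMillion: 0.6 }

// Estimate the cost of one solve from its input and output token counts.
function estimateCostUsd(usage: TokenUsage, rates = ASSUMED_RATES): number {
  const inputCost = usage.input * rates.inputPerMillion
  const outputCost = usage.output * rates.outputPerMillion
  return (inputCost + outputCost) / 1e6
}

// Average figures from the table above.
const averageUsage: TokenUsage = { input: 1707, output: 66, total: 2493 }
console.log(estimateCostUsd(averageUsage).toFixed(6)) // prints "0.000296"
```

Summing this value across runs gives a running cost total for a test pipeline.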
Examples
See examples/nfse-scraper for a complete example.
Disclaimer
This project is intended for educational purposes, authorized testing, and QA automation only. Users are solely responsible for ensuring their use complies with all applicable laws, regulations, and terms of service of the websites they interact with.
The authors assume no liability for misuse of this software. By using this package, you agree to use it responsibly and only on systems you own or have explicit authorization to test.
License
MIT
