clearscrape v1.1.0 · Official Node.js SDK for ClearScrape - Web Scraping API
# ClearScrape Node.js SDK
Official Node.js client for the ClearScrape web scraping API.
## Features

- Simple, promise-based API
- Full TypeScript support
- Automatic retries with exponential backoff
- Support for all ClearScrape features:
  - JavaScript rendering
  - Premium residential proxies
  - Antibot bypass
  - Screenshots
  - Domain-specific extractors (Amazon, Walmart, Google, etc.)
- Scraping Browser (Playwright/Puppeteer)
- Residential Proxy service
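The README doesn't spell out how the built-in retry behaves beyond "exponential backoff". As a rough illustration of the pattern (a hypothetical sketch, not the SDK's actual implementation), a retry wrapper doubles its delay after each failed attempt:

```typescript
// Generic retry helper illustrating exponential backoff.
// Hypothetical sketch -- not the SDK's internal implementation.
async function withRetries<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === retries) break;
      // Delay doubles each attempt: 500ms, 1000ms, 2000ms, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```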
## Installation

```bash
npm install clearscrape
# or
yarn add clearscrape
# or
pnpm add clearscrape
```

## Quick Start
```typescript
import { ClearScrape } from 'clearscrape';

const client = new ClearScrape({
  apiKey: process.env.CLEARSCRAPE_API_KEY
});

// Basic scrape
const result = await client.scrape({
  url: 'https://example.com'
});

console.log(result.data.html);
```

## Usage Examples
### Basic Scraping

```typescript
// Simple HTML fetch
const result = await client.scrape({
  url: 'https://example.com'
});

// Get just the HTML
const html = await client.getHtml('https://example.com');

// Get just the text content
const text = await client.getText('https://example.com');
```

### JavaScript Rendering
Enable JavaScript rendering for dynamic websites (SPAs, React, Vue, etc.):
```typescript
const result = await client.scrape({
  url: 'https://example.com/spa-page',
  jsRender: true,
  waitFor: '.product-list', // Wait for this element to appear
  wait: 3000                // Additional wait time (ms)
});
```

### Premium Proxies
Use residential proxies to avoid blocks and geo-target:
```typescript
const result = await client.scrape({
  url: 'https://example.com',
  premiumProxy: true,
  proxyCountry: 'us' // Target a specific country
});
```

### Antibot Bypass
Bypass Cloudflare, DataDome, PerimeterX and other bot protection:
```typescript
const result = await client.scrape({
  url: 'https://protected-site.com',
  antibot: true,
  premiumProxy: true
});
```

### Screenshots
Capture screenshots of web pages:
```typescript
import fs from 'fs';

// Get a base64 screenshot
const screenshot = await client.screenshot('https://example.com');

// Save to file
fs.writeFileSync('screenshot.png', Buffer.from(screenshot, 'base64'));

// Screenshot a specific element
const result = await client.scrape({
  url: 'https://example.com',
  jsRender: true,
  screenshotSelector: '.product-card'
});
```

### Domain Extractors
Extract structured data from supported websites:
```typescript
import { AmazonProduct, GoogleSerpResult } from 'clearscrape';

// Amazon product data
const product = await client.extract<AmazonProduct>(
  'https://www.amazon.com/dp/B09V3KXJPB',
  'amazon'
);

console.log(product.title);       // "Apple AirPods Pro..."
console.log(product.price);       // "$249.00"
console.log(product.rating);      // "4.7"
console.log(product.reviewCount); // "125,432"

// Google SERP data
const serp = await client.extract<GoogleSerpResult>(
  'https://www.google.com/search?q=best+laptops',
  'google'
);

console.log(serp.organicResults[0].title);
console.log(serp.featuredSnippet);
console.log(serp.relatedSearches);
```

Supported domains:

- `amazon` - Product pages
- `walmart` - Product pages
- `google` - Search results
- `google_shopping` - Shopping results
- `ebay` - Product pages
- `target` - Product pages
- `etsy` - Product pages
- `bestbuy` - Product pages
- `homedepot` - Product pages
- `zillow` - Property listings
- `yelp` - Business pages
- `indeed` - Job listings
- `linkedin_jobs` - Job listings
### Scraping Browser (Playwright/Puppeteer)
Connect to cloud browsers with built-in antibot bypass:
```typescript
// With Playwright
import { chromium } from 'playwright';

const browser = await chromium.connectOverCDP(
  client.getBrowserWsUrl()
);
const page = await browser.newPage();
await page.goto('https://example.com');
const title = await page.title();
await browser.close();
```

```typescript
// With Puppeteer
import puppeteer from 'puppeteer-core';

const browser = await puppeteer.connect({
  browserWSEndpoint: client.getBrowserWsUrl()
});
const page = await browser.newPage();
await page.goto('https://example.com');
await browser.close();
```

```typescript
// With country targeting
const wsUrl = client.getBrowserWsUrl({ proxyCountry: 'gb' });
```

### Residential Proxies
Use ClearScrape proxies with any HTTP client:
```typescript
// Get proxy configuration
const proxy = client.getProxyConfig();
// { host: 'proxy.clearscrape.io', port: 8000, username: '...', password: '...' }

// Get proxy URL string
const proxyUrl = client.getProxyUrl();
// 'http://apikey:...@proxy.clearscrape.io:8000'

// With country targeting
const usProxyUrl = client.getProxyUrl({ country: 'us' });

// With a sticky session IP
const sessionProxyUrl = client.getProxyUrl({ session: 'my-session-123' });

// Combined
const combinedProxyUrl = client.getProxyUrl({ country: 'us', session: 'abc' });
```

Use with axios:
```typescript
import axios from 'axios';
import { HttpsProxyAgent } from 'https-proxy-agent';

const agent = new HttpsProxyAgent(client.getProxyUrl({ country: 'us' }));

const response = await axios.get('https://httpbin.org/ip', {
  httpsAgent: agent
});
```

Use with node-fetch:
```typescript
import fetch from 'node-fetch';
import { HttpsProxyAgent } from 'https-proxy-agent';

const agent = new HttpsProxyAgent(client.getProxyUrl());
const response = await fetch('https://httpbin.org/ip', { agent });
```

## Configuration
```typescript
const client = new ClearScrape({
  // Required: your API key
  apiKey: 'your-api-key',

  // Optional: custom base URL (default: https://clearscrape.io/api)
  baseUrl: 'https://clearscrape.io/api',

  // Optional: request timeout in ms (default: 60000)
  timeout: 60000,

  // Optional: number of retries (default: 3)
  retries: 3
});
```

## Error Handling
```typescript
import {
  ClearScrape,
  ClearScrapeError,
  InsufficientCreditsError,
  RateLimitError
} from 'clearscrape';

try {
  const result = await client.scrape({ url: 'https://example.com' });
} catch (error) {
  if (error instanceof InsufficientCreditsError) {
    console.log(`Need ${error.required} credits`);
  } else if (error instanceof RateLimitError) {
    console.log('Rate limited, try again later');
  } else if (error instanceof ClearScrapeError) {
    console.log(`Error ${error.statusCode}: ${error.message}`);
  }
}
```

## TypeScript
The SDK is written in TypeScript and includes full type definitions:
```typescript
import {
  ClearScrape,
  ScrapeOptions,
  ScrapeResponse,
  AmazonProduct,
  GoogleSerpResult,
  DomainType
} from 'clearscrape';

// Full type safety
const options: ScrapeOptions = {
  url: 'https://example.com',
  jsRender: true,
  premiumProxy: true
};

const result: ScrapeResponse = await client.scrape(options);
```

## API Reference
### ClearScrape

Main client class.

#### Constructor

```typescript
new ClearScrape(config: ClearScrapeConfig)
```

#### Methods
| Method | Description |
|--------|-------------|
| `scrape(options)` | Scrape a URL with full options |
| `getHtml(url, options?)` | Get HTML content only |
| `getText(url, options?)` | Get text content only |
| `screenshot(url, options?)` | Capture a screenshot |
| `extract(url, domain)` | Extract structured data |
| `getProxyConfig(options?)` | Get the proxy configuration object |
| `getProxyUrl(options?)` | Get the proxy URL string |
| `getBrowserWsUrl(options?)` | Get the Scraping Browser WebSocket URL |
#### ScrapeOptions

| Option | Type | Description |
|--------|------|-------------|
| `url` | string | Target URL (required) |
| `method` | string | HTTP method (default: `GET`) |
| `jsRender` | boolean | Enable JS rendering (+5 credits) |
| `premiumProxy` | boolean | Use a residential proxy (+10 credits) |
| `antibot` | boolean | Enable antibot bypass (+25 credits) |
| `proxyCountry` | string | Two-letter country code |
| `waitFor` | string | CSS selector to wait for |
| `wait` | number | Wait time in ms |
| `autoScroll` | boolean | Scroll to load lazy content |
| `screenshot` | boolean | Capture a screenshot |
| `screenshotSelector` | string | Screenshot a specific element |
| `headers` | object | Custom HTTP headers |
| `body` | string \| object | Request body |
| `domain` | DomainType | Domain extractor |
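The `method`, `headers`, and `body` options combine to scrape endpoints that expect a POST. A sketch of such a request (the target URL, header values, and payload here are illustrative placeholders, not a real endpoint):

```typescript
// Illustrative ScrapeOptions payload for a POST scrape.
// URL, headers, and body are placeholder values.
const postOptions = {
  url: 'https://example.com/api/search',
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: { query: 'laptops', page: 1 } // body may be a string or an object
};

// Then pass it to the client as usual:
// const result = await client.scrape(postOptions);
```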
## Credits

| Feature | Cost |
|---------|------|
| Base request | 1 credit |
| + JavaScript rendering | +5 credits |
| + Premium proxy | +10 credits |
| + Antibot bypass | +25 credits |
| Domain API extraction | 25 credits |
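Per the table, feature costs stack on top of the 1-credit base request, while domain extraction is billed at a flat 25 credits. A small helper to estimate cost before sending a request (hypothetical, not part of the SDK; the assumption that domain extraction replaces rather than adds to the other costs is ours, based on the table's wording):

```typescript
// Hypothetical cost estimator based on the pricing table above.
// Not part of the clearscrape SDK.
interface CostOptions {
  jsRender?: boolean;
  premiumProxy?: boolean;
  antibot?: boolean;
  domain?: string; // assumed: flat 25 credits, replacing other costs
}

function estimateCredits(options: CostOptions = {}): number {
  if (options.domain) return 25; // Domain API extraction
  let credits = 1;               // Base request
  if (options.jsRender) credits += 5;
  if (options.premiumProxy) credits += 10;
  if (options.antibot) credits += 25;
  return credits;
}
```

For example, a request with JS rendering, a premium proxy, and antibot bypass would cost 1 + 5 + 10 + 25 = 41 credits.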
## Support

## License

MIT
