# @alterlab/sdk

v2.2.0 · AlterLab Node.js SDK
The official Node.js/TypeScript client for the AlterLab web scraping API. Scrape any website with anti-bot bypass, structured extraction, and automatic tier escalation.
## Installation

```bash
npm install @alterlab/sdk
```

## Quick Start

```ts
import { AlterLab } from "@alterlab/sdk";

const client = new AlterLab({ apiKey: "sk_test_..." });
const result = await client.scrape({ url: "https://example.com" });
console.log(result.content);
```

## Methods
### scrape(request)

Scrape a single URL.

```ts
// Simple scrape
const result = await client.scrape({ url: "https://example.com" });
```

```ts
// JavaScript rendering with screenshot
const result = await client.scrape({
  url: "https://example.com",
  mode: "js",
  render_js: true,
  screenshot: true,
});
console.log(result.screenshot_url);
```

```ts
// Structured extraction
const result = await client.scrape({
  url: "https://example.com/product",
  extraction_profile: "product",
});
console.log(result.extracted_data);
```

```ts
// With JSON Schema extraction
const result = await client.scrape({
  url: "https://example.com/product",
  extraction_schema: {
    type: "object",
    properties: {
      title: { type: "string" },
      price: { type: "number" },
    },
  },
});
```

**ScrapeRequest fields:**
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| url | string | required | URL to scrape |
| mode | string | "auto" | auto, html, js, pdf, ocr |
| sync | boolean | true | Wait for result (false returns job_id) |
| render_js | boolean | false | Enable JavaScript rendering (+3 credits) |
| screenshot | boolean | false | Capture screenshot (+1 credit, needs render_js) |
| markdown | boolean | false | Convert to markdown (free) |
| wait_for | string | — | CSS selector to wait for (JS mode) |
| timeout | number | — | Request timeout in seconds |
| force_refresh | boolean | false | Bypass cache |
| extraction_schema | object | — | JSON Schema for structured extraction |
| extraction_prompt | string | — | Natural language extraction instructions |
| extraction_profile | string | — | auto, product, article, job_posting, faq, recipe, event |
| max_credits | number | — | Maximum credits for this request |
| max_tier | string | — | Maximum tier: 0.5, 1, 1.5, 2, 3, 4 |
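Two of the constraints in the table are easy to trip over in programmatic use: `url` is required, and `screenshot` only works when `render_js` is enabled. A minimal pre-flight check along these lines can catch both before spending credits (`validateScrapeRequest` is an illustrative helper, not part of the SDK):

```ts
// Illustrative pre-flight validation based on the field table above.
// Not part of @alterlab/sdk.
interface ScrapeOpts {
  url: string;
  render_js?: boolean;
  screenshot?: boolean;
}

function validateScrapeRequest(req: ScrapeOpts): string[] {
  const problems: string[] = [];
  if (!req.url) problems.push("url is required");
  if (req.screenshot && !req.render_js) {
    problems.push("screenshot requires render_js: true");
  }
  return problems;
}

console.log(validateScrapeRequest({ url: "https://example.com", screenshot: true }));
// → ["screenshot requires render_js: true"]
```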
### crawl(request)

Start a multi-page website crawl. Returns immediately with a crawl_id.

```ts
const crawl = await client.crawl({
  url: "https://example.com",
  maxPages: 200,
  maxDepth: 2,
  includePatterns: ["/blog/*"],
  formats: ["markdown"],
});
console.log(`Crawl ID: ${crawl.crawl_id}`);
console.log(`Estimated pages: ${crawl.estimated_pages}`);
```

**CrawlRequest fields:**
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| url | string | required | Starting URL |
| maxPages | number | 50 | Maximum pages (1-100,000) |
| maxDepth | number | 3 | Link-following depth (0-50) |
| includePatterns | string[] | — | Glob patterns to include |
| excludePatterns | string[] | — | Glob patterns to exclude |
| formats | string[] | — | Output formats per page |
| extractionSchema | object | — | JSON Schema for each page |
| extractionProfile | string | — | Extraction profile for each page |
| webhookUrl | string | — | Webhook for crawl.completed |
| respectRobots | boolean | true | Respect robots.txt |
| includeSubdomains | boolean | false | Follow subdomain links |
| costControls | object | — | { maxCredits, maxTier, forceTier } |
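Per the table, `maxPages` is bounded to 1-100,000 and `maxDepth` to 0-50. When crawl sizes are computed dynamically, a small clamp keeps the request inside those documented ranges (an illustrative helper, not part of the SDK):

```ts
// Clamp crawl sizing to the documented bounds (illustrative, not part of the SDK).
function clampCrawlBounds(maxPages: number, maxDepth: number) {
  return {
    maxPages: Math.min(Math.max(maxPages, 1), 100_000),
    maxDepth: Math.min(Math.max(maxDepth, 0), 50),
  };
}

console.log(clampCrawlBounds(500_000, 99)); // → { maxPages: 100000, maxDepth: 50 }
```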
### crawlAndWait(request, options?)

Start a crawl and poll until all pages are done.

```ts
const results = await client.crawlAndWait(
  {
    url: "https://example.com",
    maxPages: 50,
    formats: ["markdown"],
  },
  { pollTimeout: 300000 },
);
console.log(`Pages: ${results.completed}/${results.total}`);
for (const page of results.pages ?? []) {
  console.log(`  ${page.url}: ${page.status}`);
}
```

### getCrawl(crawlId, includeResults?)
Get crawl progress. Pass `true` to include per-page results.

```ts
const status = await client.getCrawl(crawlId);
console.log(`${status.completed}/${status.total} pages done`);
```

### cancelCrawl(crawlId)
Cancel a running crawl. Credits for unprocessed pages are refunded.

```ts
const result = await client.cancelCrawl(crawlId);
console.log(`Refunded ${result.credits_refunded} credits`);
```

### batchScrape(requests, webhookUrl?)
Submit multiple scrape requests as a batch.

```ts
const batch = await client.batchScrape(
  [
    { url: "https://example.com/page1", mode: "html" },
    { url: "https://example.com/page2", mode: "js" },
  ],
  "https://myapp.com/webhook",
);

for (const jobId of batch.job_ids) {
  const result = await client.waitForJob(jobId);
  console.log(result.result?.content.length);
}
```

### estimateCost(request)
Get a cost estimate before scraping.

```ts
const estimate = await client.estimateCost({
  url: "https://example.com",
  mode: "js",
});
console.log(`Estimated: ${estimate.estimated_credits} credits`);
console.log(`Max possible: ${estimate.max_possible_credits} credits`);
```

### getUsage()
Check your credit balance and usage stats.

```ts
const usage = await client.getUsage();
console.log(`Credits remaining: ${usage.credits_available}`);
console.log(`Plan: ${usage.plan}`);
```

### waitForJob(jobId, options?)
Poll an async job until completion, with exponential backoff.

```ts
const result = await client.waitForJob(jobId, {
  pollInterval: 2000,
  pollTimeout: 300000,
  backoffMultiplier: 1.5,
  maxInterval: 30000,
});
```

### getJobStatus(jobId)
Check job status without waiting.

```ts
const job = await client.getJobStatus(jobId);
console.log(`Status: ${job.status}`);
```

## Client Options
```ts
const client = new AlterLab({
  apiKey: "sk_test_...",              // Required
  baseUrl: "https://api.alterlab.io", // Default API URL
  maxRetries: 3,                      // Retry count for transient failures
  retryDelay: 1000,                   // Initial retry delay in ms (exponential backoff)
  logger: silentLogger,               // Custom logger (or silentLogger to disable)
});
```

The client automatically retries 429 (rate limit) and 5xx errors with exponential backoff, honoring the Retry-After header when present.
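To make the backoff concrete: with `retryDelay: 1000` and `maxRetries: 3`, the wait before each retry grows exponentially from the initial delay. The doubling factor below is an assumption for illustration (the SDK's actual multiplier is internal), and `retrySchedule` is not part of the SDK:

```ts
// Illustrative retry schedule: initial delay, growing exponentially per attempt.
// The factor of 2 is an assumption, not a documented SDK value.
function retrySchedule(retryDelay: number, maxRetries: number, factor = 2): number[] {
  return Array.from({ length: maxRetries }, (_, i) => retryDelay * factor ** i);
}

console.log(retrySchedule(1000, 3)); // → [1000, 2000, 4000]
```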
## Custom Logger

```ts
import { silentLogger } from "@alterlab/sdk";

// Silent mode (no console output)
const client = new AlterLab({ apiKey: "...", logger: silentLogger });
```

```ts
// Custom logger
const client = new AlterLab({
  apiKey: "...",
  logger: {
    debug: (msg) => myLogger.debug(msg),
    warn: (msg) => myLogger.warn(msg),
    error: (msg) => myLogger.error(msg),
  },
});
```

## TypeScript
The SDK exports full TypeScript types for all requests and responses:

```ts
import type {
  ScrapeRequest,
  ScrapeResponse,
  CrawlRequest,
  CrawlResponse,
  CrawlStatusResponse,
  BatchScrapeRequest,
  BatchResponse,
  CostEstimateRequest,
  CostEstimateResponse,
  JobStatusResponse,
  UsageStats,
  AlterLabOptions,
  WaitForJobOptions,
  CrawlAndWaitOptions,
} from "@alterlab/sdk";
```

## Error Handling
```ts
import axios from "axios";

try {
  const result = await client.scrape({ url: "https://example.com" });
} catch (error) {
  if (axios.isAxiosError(error)) {
    console.log(`HTTP ${error.response?.status}: ${error.response?.data}`);
  } else {
    console.log(`Error: ${(error as Error).message}`);
  }
}
```

Errors from the API include the HTTP status code and detail message in the Axios error response. The SDK retries transient errors (429, 5xx) automatically.
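The transient-error rule above (429 and any 5xx) can be expressed as a small predicate, e.g. to decide whether to re-enqueue a job in your own code once the SDK's built-in retries are exhausted (`isRetryableStatus` is an illustrative helper, not an SDK export):

```ts
// True for statuses the SDK treats as transient: 429 and any 5xx.
// Illustrative helper, not exported by @alterlab/sdk.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}

console.log(isRetryableStatus(429)); // → true
console.log(isRetryableStatus(503)); // → true
console.log(isRetryableStatus(404)); // → false
```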
## Links
- Dashboard — Get your API key
- API Documentation — Full API reference
- PyPI package — Python SDK
