@crawlgate/sdk

Official JavaScript/TypeScript SDK for CrawlGate Search Engine API.

Installation

npm install @crawlgate/sdk

Quick Start

import { CrawlGateClient } from '@crawlgate/sdk';

const client = new CrawlGateClient({
  apiKey: 'sk_live_...',
  apiUrl: 'https://api.crawlgate.io' // or your self-hosted URL
});

// Scrape a single page
const doc = await client.scrape('https://example.com');
console.log(doc.markdown);

Configuration

const client = new CrawlGateClient({
  // Required: API key (or set CRAWLGATE_API_KEY env var)
  apiKey: 'sk_live_...',

  // Optional: API URL (default: https://api.crawlgate.io)
  apiUrl: 'https://api.crawlgate.io',

  // Optional: Request timeout in ms (default: 90000)
  timeoutMs: 90000,

  // Optional: Max retries for failed requests (default: 3)
  maxRetries: 3,

  // Optional: Backoff factor for retries in seconds (default: 0.5)
  backoffFactor: 0.5
});
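
Failed requests are retried automatically with exponential backoff. As an illustration only (the exact retry schedule is internal to the SDK and may differ), a backoff factor of 0.5 typically produces delays along these lines:

// Illustrative only -- not the SDK's actual retry implementation.
const backoffFactor = 0.5; // seconds
for (let attempt = 0; attempt < 3; attempt++) {
  const delaySeconds = backoffFactor * 2 ** attempt; // 0.5s, 1s, 2s
  console.log(`retry ${attempt + 1} in ${delaySeconds}s`);
}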

Engines

CrawlGate supports three scraping engines:

| Engine  | Description                  | Best For               |
|---------|------------------------------|------------------------|
| static  | Axios + Cheerio (no browser) | Fast, simple pages     |
| dynamic | Playwright headless browser  | JavaScript-heavy sites |
| smart   | Auto-selects static/dynamic  | Cost optimization      |
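
The smart engine's selection heuristic is internal to CrawlGate, but the idea is to try the cheap static path first and fall back to a browser when the page appears to be JavaScript-rendered. A minimal sketch of that pattern built on the SDK itself (illustrative, not the actual implementation):

// Illustrative fallback heuristic -- not CrawlGate's actual logic.
async function staticThenDynamic(url: string) {
  const doc = await client.scrape(url, { engine: 'static' });
  // A near-empty static result usually means the page needs a browser.
  if ((doc.markdown ?? '').length < 200) {
    return client.scrape(url, { engine: 'dynamic' });
  }
  return doc;
}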

API Reference

Scrape

Scrape a single URL.

const doc = await client.scrape('https://example.com', {
  engine: 'smart',           // 'static' | 'dynamic' | 'smart'
  formats: ['markdown', 'html'],
  onlyMainContent: true,
  excludeTags: ['nav', 'footer'],
  waitFor: 1000,             // Wait ms before scraping (dynamic only)
  timeout: 30000,
  proxy: 'stealth'           // 'iproyal' | 'stealth' | 'tor'
});

console.log(doc.markdown);
console.log(doc.metadata?.title);

Scrape with LLM Extraction

Extract structured data using AI.

import { z } from 'zod';

const schema = z.object({
  productName: z.string(),
  price: z.number(),
  inStock: z.boolean(),
  features: z.array(z.string())
});

const doc = await client.scrape('https://example.com/product', {
  engine: 'smart',
  extract: {
    schema,
    systemPrompt: 'Extract product information from the page',
    provider: 'openai',        // 'openai' | 'anthropic'
    enableFallback: true       // Try other provider if primary fails
  }
});

console.log(doc.extract?.data);
// { productName: '...', price: 99.99, inStock: true, features: [...] }

Batch Scrape

Scrape multiple URLs in a single job.

// Method 1: Wait for completion (recommended)
const job = await client.batchScrape(
  ['https://a.com', 'https://b.com', 'https://c.com'],
  {
    options: {
      formats: ['markdown'],
      engine: 'smart'
    },
    pollInterval: 2000,    // Poll every 2 seconds
    timeout: 300           // Max wait time in seconds
  }
);

console.log(`Scraped ${job.completed} URLs`);
job.data.forEach(doc => {
  console.log(doc.url, doc.markdown?.length);
});

// Method 2: Manual polling
const { id } = await client.startBatchScrape(
  ['https://a.com', 'https://b.com'],
  { options: { formats: ['markdown'] } }
);

let status = await client.getBatchScrapeStatus(id);
while (status.status === 'scraping') {
  console.log(`Progress: ${status.completed}/${status.total}`);
  await new Promise(r => setTimeout(r, 2000));
  status = await client.getBatchScrapeStatus(id);
}

// Cancel a batch job
await client.cancelBatchScrape(id);

// Get errors
const errors = await client.getBatchScrapeErrors(id);
console.log('Failed URLs:', errors.errors.map(e => e.url));

Crawl

Crawl multiple pages from a website.

// Method 1: Wait for completion (recommended)
const job = await client.crawl('https://example.com', {
  limit: 10,
  engine: 'dynamic',
  formats: ['markdown'],
  pollInterval: 2000,    // Poll every 2 seconds
  timeout: 300           // Max wait time in seconds
});

console.log(`Crawled ${job.completed} pages`);
job.data.forEach(doc => {
  console.log(doc.url, doc.markdown?.length);
});

// Method 2: Manual polling
const { id } = await client.startCrawl('https://example.com', { limit: 10 });

let status = await client.getCrawlStatus(id);
while (status.status === 'scraping') {
  console.log(`Progress: ${status.completed}/${status.total}`);
  await new Promise(r => setTimeout(r, 2000));
  status = await client.getCrawlStatus(id);
}

// Cancel a crawl job
await client.cancelCrawl(id);

// Get errors
const errors = await client.getCrawlErrors(id);
console.log('Failed URLs:', errors.errors.map(e => e.url));
console.log('Blocked by robots.txt:', errors.robotsBlocked);

Extract (Standalone LLM Extraction)

Extract structured data from URLs using an LLM.

import { z } from 'zod';

// With Zod schema
const result = await client.extract({
  urls: ['https://example.com/product'],
  schema: z.object({
    name: z.string(),
    price: z.number(),
    inStock: z.boolean(),
    features: z.array(z.string())
  }),
  systemPrompt: 'Extract product information from the page',
  provider: 'openai',
  timeout: 60
});

console.log(result.data);

// With natural language prompt
const promptResult = await client.extract({
  urls: ['https://example.com/about'],
  prompt: 'Extract the company name, founding year, and list of team members',
  enableWebSearch: true
});

console.log(promptResult.data);

// Manual polling
const { id } = await client.startExtract({
  urls: ['https://example.com'],
  schema: { name: 'string', price: 'number' }
});

let status = await client.getExtractStatus(id);
while (status.status === 'processing') {
  await new Promise(r => setTimeout(r, 2000));
  status = await client.getExtractStatus(id);
}
console.log(status.data);

Map

Discover all URLs on a website.

const result = await client.map('https://example.com', {
  engine: 'dynamic'
});

console.log(`Found ${result.count} URLs:`);
result.links.forEach(url => console.log(url));

Search

Search the web with optional scraping.

// Basic search
const results = await client.search('best restaurants in NYC', {
  limit: 10,
  lang: 'en',
  country: 'us'
});

results.data.forEach(r => {
  console.log(`${r.title}: ${r.url}`);
});

// Search + scrape each result
const scrapedResults = await client.search('best laptops 2024', {
  limit: 5,
  scrapeOptions: {
    formats: ['markdown']
  },
  engine: 'smart'
});

scrapedResults.data.forEach(r => {
  console.log(r.title);
  console.log(r.markdown?.substring(0, 500));
});

// Search + LLM extraction (zod schema, as in the Extract examples)
import { z } from 'zod';

const reviewResults = await client.search('iPhone reviews', {
  limit: 5,
  scrapeOptions: { formats: ['markdown'] },
  extract: {
    schema: z.object({
      sentiment: z.enum(['positive', 'negative', 'neutral']),
      rating: z.number().optional(),
      summary: z.string()
    }),
    systemPrompt: 'Analyze the review sentiment'
  }
});

console.log(reviewResults.extract?.data);

Usage & Monitoring

Monitor your API usage.

// Get concurrency usage
const { concurrency, maxConcurrency } = await client.getConcurrency();
console.log(`Using ${concurrency}/${maxConcurrency} concurrent requests`);

// Get credit usage
const credits = await client.getCreditUsage();
console.log(`Remaining credits: ${credits.remainingCredits}`);

// Get token usage (for LLM extraction)
const tokens = await client.getTokenUsage();
console.log(`Remaining tokens: ${tokens.remainingTokens}`);

// Get queue status
const queue = await client.getQueueStatus();
console.log(`Jobs in queue: ${queue.jobsInQueue}`);
console.log(`Active: ${queue.activeJobsInQueue}, Waiting: ${queue.waitingJobsInQueue}`);

Error Handling

import {
  CrawlGateClient,
  CrawlGateError,
  AuthenticationError,
  ValidationError,
  JobTimeoutError,
  RateLimitError
} from '@crawlgate/sdk';

try {
  const doc = await client.scrape('https://example.com');
} catch (error) {
  if (error instanceof AuthenticationError) {
    console.error('Invalid API key');
  } else if (error instanceof ValidationError) {
    console.error('Invalid request:', error.message);
  } else if (error instanceof JobTimeoutError) {
    console.error(`Job ${error.jobId} timed out after ${error.timeoutSeconds}s`);
  } else if (error instanceof RateLimitError) {
    console.error('Rate limited, retry after:', error.retryAfter);
  } else if (error instanceof CrawlGateError) {
    console.error('API error:', error.message, error.statusCode);
  }
}
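
If you want to honor the rate-limit hint automatically, a small wrapper works well. A sketch, assuming retryAfter is expressed in seconds:

// Sketch: wait out a rate limit once, then retry. Assumes `retryAfter`
// is in seconds; check the docs for the actual unit.
async function scrapeWithRateLimitRetry(url: string) {
  try {
    return await client.scrape(url);
  } catch (error) {
    if (error instanceof RateLimitError && error.retryAfter) {
      await new Promise(r => setTimeout(r, error.retryAfter * 1000));
      return client.scrape(url); // single retry; loop if you need more
    }
    throw error;
  }
}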

TypeScript Support

Full TypeScript support with exported types:

import type {
  CrawlGateClientOptions,
  ScrapeOptions,
  CrawlOptions,
  MapOptions,
  SearchOptions,
  Document,
  CrawlJob,
  SearchResponse,
  Engine,
  ExtractOptions,
  // Batch scrape
  BatchScrapeOptions,
  BatchScrapeJob,
  BatchScrapeResponse,
  // Extract
  ExtractRequestOptions,
  ExtractResponse,
  // Usage
  ConcurrencyInfo,
  CreditUsage,
  TokenUsage,
  QueueStatus,
  // Errors
  CrawlError,
  CrawlErrorsResponse
} from '@crawlgate/sdk';
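
These types make it straightforward to build typed helpers around the client. For example, a small sketch of a wrapper that scrapes several URLs with shared options (for large URL lists, prefer batchScrape, which runs as one server-side job):

import type { ScrapeOptions, Document } from '@crawlgate/sdk';

// Typed convenience wrapper: scrape a few URLs with the same options.
async function scrapeAll(urls: string[], options: ScrapeOptions): Promise<Document[]> {
  return Promise.all(urls.map(url => client.scrape(url, options)));
}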

Environment Variables

# API key (used if not passed to constructor)
CRAWLGATE_API_KEY=sk_live_...

# API URL (used if not passed to constructor)
CRAWLGATE_API_URL=https://api.crawlgate.io
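
With both variables set, the client needs no explicit configuration. A sketch, assuming the constructor accepts an empty options object:

import { CrawlGateClient } from '@crawlgate/sdk';

// apiKey and apiUrl fall back to CRAWLGATE_API_KEY and CRAWLGATE_API_URL.
const client = new CrawlGateClient({});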

Documentation

Full documentation is available at docs.crawlgate.io.

Support