crawl4ai

v1.0.1

TypeScript SDK for Crawl4AI REST API - Bun & Node.js compatible

Crawl4AI TypeScript SDK

A type-safe TypeScript SDK for the Crawl4AI REST API. Built for modern JavaScript/TypeScript environments with full Bun and Node.js compatibility.

🚀 Features

  • Full TypeScript Support - Complete type definitions for all API endpoints and responses
  • Bun & Node.js Compatible - Works seamlessly in both runtimes
  • Modern Async/Await - Promise-based API for all operations
  • Comprehensive Coverage - All Crawl4AI endpoints including specialized features
  • Smart Error Handling - Custom error classes with retry logic and timeouts
  • Batch Processing - Efficiently crawl multiple URLs in a single request
  • Input Validation - Built-in URL validation and parameter checking
  • Debug Mode - Optional request/response logging for development
  • Zero Dependencies - Uses only native fetch API

📦 Installation

Using Bun (Recommended)

bun add crawl4ai

Using npm/yarn

npm install crawl4ai
# or
yarn add crawl4ai

📚 About Crawl4AI

⚠️ Unofficial Package: This is an unofficial TypeScript SDK for the Crawl4AI REST API, created for personal use to provide a type-safe way to interact with it.

🏗️ Prerequisites

  1. Crawl4AI Server Running

    You can use the hosted version or run your own:

    # Using Docker
    docker run -p 11235:11235 unclecode/crawl4ai:latest
       
    # With LLM support
    docker run -p 11235:11235 \
      -e OPENAI_API_KEY=your_openai_key \
      -e ANTHROPIC_API_KEY=your_anthropic_key \
      unclecode/crawl4ai:latest
  2. TypeScript (optional, for TypeScript projects)

    bun add -d typescript
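
With the server from step 1 running, you can sanity-check connectivity before your first crawl. A minimal sketch using the client's health() method (covered under Utility Methods below), assuming the default Docker port mapping shown above:

import Crawl4AI from 'crawl4ai';

const client = new Crawl4AI({ baseUrl: 'http://localhost:11235' });

// Should resolve with the server's health status if the container is up
console.log(await client.health());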

🚀 Quick Start

Basic Usage

import Crawl4AI from 'crawl4ai';

// Initialize the client
const client = new Crawl4AI({
  baseUrl: 'http://localhost:11235', // your Crawl4AI server
  apiToken: 'your_token_here', // Optional
  timeout: 30000,
  debug: true // Enable request/response logging
});

// Perform a basic crawl
const results = await client.crawl({
  urls: 'https://example.com',
  browser_config: {
    headless: true,
    viewport: { width: 1920, height: 1080 }
  },
  crawler_config: {
    cache_mode: 'bypass',
    word_count_threshold: 10
  }
});

const result = results[0]; // API returns array of results
console.log('Title:', result.metadata?.title);
console.log('Content:', result.markdown?.slice(0, 200));

Configuration Options

const client = new Crawl4AI({
  baseUrl: 'https://your-crawl4ai-server.com',
  apiToken: 'optional_api_token',
  timeout: 60000,          // Request timeout in ms
  retries: 3,              // Number of retry attempts
  retryDelay: 1000,        // Delay between retries in ms
  throwOnError: true,      // Throw on HTTP errors
  debug: false,            // Enable debug logging
  defaultHeaders: {        // Additional headers
    'User-Agent': 'MyApp/1.0'
  }
});

📖 API Reference

Core Methods

crawl(request) - Main Crawl Endpoint

Crawl one or more URLs with full configuration options:

const results = await client.crawl({
  urls: ['https://example.com', 'https://example.org'],
  browser_config: {
    headless: true,
    simulate_user: true,
    magic: true // Anti-detection features
  },
  crawler_config: {
    cache_mode: 'bypass',
    extraction_strategy: {
      type: 'json_css',
      params: { /* CSS extraction config */ }
    }
  }
});

Content Generation

markdown(request) - Get Markdown

Extract markdown with various filters:

const markdown = await client.markdown({
  url: 'https://example.com',
  filter: 'fit',  // 'raw' | 'fit' | 'bm25' | 'llm'
  query: 'search query for bm25/llm filters'
});

html(request) - Get Processed HTML

Get sanitized HTML for schema extraction:

const html = await client.html({
  url: 'https://example.com'
});

screenshot(request) - Capture Screenshot

Capture full-page screenshots:

const screenshotBase64 = await client.screenshot({
  url: 'https://example.com',
  screenshot_wait_for: 2,  // Wait 2 seconds before capture
  output_path: '/path/to/save.png'  // Optional: save to file
});
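
If you'd rather handle the file yourself instead of passing output_path, you can write the returned value out directly. A minimal Node/Bun sketch, assuming the result is a base64-encoded PNG (as the variable name above suggests):

import { writeFileSync } from 'node:fs';

// Decode the base64 payload and write it to disk as a PNG
writeFileSync('screenshot.png', Buffer.from(screenshotBase64, 'base64'));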

pdf(request) - Generate PDF

Generate PDF documents:

const pdfData = await client.pdf({
  url: 'https://example.com',
  output_path: '/path/to/save.pdf'  // Optional: save to file
});

JavaScript Execution

executeJs(request) - Run JavaScript

Execute JavaScript on the page and get full crawl results:

const result = await client.executeJs({
  url: 'https://example.com',
  scripts: [
    'return document.title;',
    'return document.querySelectorAll("a").length;',
    'window.scrollTo(0, document.body.scrollHeight);'
  ]
});

console.log('JS Results:', result.js_execution_result);

AI/LLM Features

ask(params) - Get Library Context

Get Crawl4AI documentation for AI assistants:

const answer = await client.ask({
  query: 'extraction strategies',
  context_type: 'doc',  // 'code' | 'doc' | 'all'
  max_results: 10
});

llm(url, query) - LLM Endpoint

Process URLs with LLM:

const response = await client.llm(
  'https://example.com',
  'What is the main purpose of this website?'
);

Utility Methods

// Test connection
const isConnected = await client.testConnection();
// Pass { throwOnError: true } to surface error details:
// await client.testConnection({ throwOnError: true });

// Get health status
const health = await client.health();

// Get API version (also accepts { throwOnError: true })
const version = await client.version();

// Get Prometheus metrics
const metrics = await client.metrics();

// Update configuration
client.setApiToken('new_token');
client.setBaseUrl('https://new-url.com');
client.setDebug(true);

🎯 Data Extraction Strategies

CSS Selector Extraction

Extract structured data using CSS selectors:

const results = await client.crawl({
  urls: 'https://news.ycombinator.com',
  crawler_config: {
    extraction_strategy: {
      type: 'json_css',
      params: {
        schema: {
          baseSelector: '.athing',
          fields: [
            {
              name: 'title',
              selector: '.titleline > a',
              type: 'text'
            },
            {
              name: 'url', 
              selector: '.titleline > a',
              type: 'href'
            },
            {
              name: 'score',
              selector: '+ tr .score',
              type: 'text'
            }
          ]
        }
      }
    }
  }
});

const posts = JSON.parse(results[0].extracted_content || '[]');
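
Since extracted_content is a JSON string, you can also give the parsed value a shape. A sketch with a hypothetical HNPost interface matching the schema above:

interface HNPost {
  title: string;
  url: string;
  score: string;
}

const typedPosts = JSON.parse(results[0].extracted_content || '[]') as HNPost[];
console.log(typedPosts[0]?.title);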

LLM-Based Extraction

Use AI models for intelligent data extraction:

const results = await client.crawl({
  urls: 'https://www.bbc.com/news',
  crawler_config: {
    extraction_strategy: {
      type: 'llm',
      params: {
        provider: 'openai/gpt-4o-mini',
        api_token: process.env.OPENAI_API_KEY,
        schema: {
          type: 'object',
          properties: {
            headline: { type: 'string' },
            summary: { type: 'string' },
            author: { type: 'string' },
            tags: {
              type: 'array',
              items: { type: 'string' }
            }
          }
        },
        extraction_type: 'schema',
        instruction: 'Extract news articles with their key information'
      }
    }
  }
});
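
As with CSS extraction, the structured output comes back as a JSON string in extracted_content, so it is parsed the same way:

const articles = JSON.parse(results[0].extracted_content || '[]');
console.log(articles);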

Cosine Similarity Extraction

Filter content based on semantic similarity:

const results = await client.crawl({
  urls: 'https://example.com/blog',
  crawler_config: {
    extraction_strategy: {
      type: 'cosine',
      params: {
        semantic_filter: 'artificial intelligence machine learning',
        word_count_threshold: 50,
        max_dist: 0.3,
        top_k: 5
      }
    }
  }
});

🛠️ Error Handling

The SDK provides custom error handling with detailed information:

import { Crawl4AIError } from 'crawl4ai';

try {
  const results = await client.crawl({ urls: 'https://example.com' });
} catch (error) {
  if (error instanceof Crawl4AIError) {
    console.error('API Error:', error.message);
    console.error('Status:', error.status);
    console.error('Details:', error.data);
  } else {
    console.error('Unexpected error:', error);
  }
}

🧪 Testing

Run the test suite:

# Run all tests
bun test

# Run specific test file
bun test src/sdk.test.ts

📚 Examples

Run the included examples:

# Basic usage
bun run example:basic

# Advanced features
bun run example:advanced

# LLM extraction
bun run example:llm

🔒 Security & Best Practices

Authentication

Always use API tokens in production:

const client = new Crawl4AI({
  baseUrl: 'https://your-crawl4ai-server.com',
  apiToken: process.env.CRAWL4AI_API_TOKEN
});

Rate Limiting

Implement client-side throttling:

// Sequential processing with delays
for (const url of urls) {
  const results = await client.crawl({ urls: url });
  await new Promise(resolve => setTimeout(resolve, 1000)); // 1s delay
}
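
Alternatively, since crawl() accepts an array of URLs (see Batch Processing above), you can send fixed-size chunks and pace the chunks instead of individual requests; a sketch:

const chunkSize = 5;

for (let i = 0; i < urls.length; i += chunkSize) {
  const chunk = urls.slice(i, i + chunkSize);
  // One request covers the whole chunk; the server crawls them together
  const results = await client.crawl({ urls: chunk });
  console.log(`Crawled ${results.length} pages`);
  await new Promise(resolve => setTimeout(resolve, 1000)); // pause between chunks
}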

Input Validation

The SDK automatically validates URLs before making requests. Invalid URLs will throw a Crawl4AIError.
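
For example, a malformed URL is rejected client-side before any request is sent; a minimal sketch:

import { Crawl4AIError } from 'crawl4ai';

try {
  await client.crawl({ urls: 'not-a-valid-url' });
} catch (error) {
  if (error instanceof Crawl4AIError) {
    console.error('Validation failed:', error.message);
  }
}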

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This SDK is released under the MIT License.

🙏 Acknowledgments

Built for the amazing Crawl4AI project by @unclecode and the Crawl4AI community.