firecrawl-simple-client v1.0.2
Firecrawl Simple API Client
A TypeScript API client library for Firecrawl Simple.
What is Firecrawl Simple?
Firecrawl Simple is a stripped down and stable version of Firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed.
Installation
npm install firecrawl-simple-client
Structure
The client source is organized as follows:
src/
├── firecrawl-client.ts - Main API client class implementation
├── config.ts - Configuration types and defaults
├── index.ts - Entry point exporting the API client and types
└── client/ - Auto-generated API client code
Usage
Creating a Client Instance
import { FirecrawlClient } from 'firecrawl-simple-client';
// Create a client with default configuration (localhost:3002/v1)
const client = new FirecrawlClient();
// Create a client with custom configuration
const clientWithConfig = new FirecrawlClient({
apiUrl: 'https://api.firecrawl.com/v1',
apiKey: 'your-api-key',
});
Basic Usage
import { FirecrawlClient } from 'firecrawl-simple-client';
const client = new FirecrawlClient({
apiUrl: 'https://api.firecrawl.com/v1',
apiKey: 'your-api-key',
});
// Start a crawl job
const crawlResult = await client.startCrawl({
url: 'https://example.com',
maxDepth: 3,
limit: 100 // Maximum number of pages to crawl (formerly maxPages)
});
// Check crawl status
const crawlStatus = await client.getCrawlStatus(crawlResult.id);
// Cancel a crawl job
await client.cancelCrawl(crawlResult.id);
// Scrape a single webpage (synchronous operation)
const scrapeResult = await client.scrapeWebpage({
url: 'https://example.com',
waitFor: 0, // Time in ms to wait for JavaScript execution
formats: ['markdown', 'html'],
timeout: 30000
});
// Access scrape results directly
console.log(scrapeResult.data.markdown);
// Generate a sitemap
const sitemapResult = await client.generateSitemap({
url: 'https://example.com'
});
// Check API health (note: these are deprecated in the new API; see Deprecated Methods below)
const liveness = await client.checkLiveness();
const readiness = await client.checkReadiness();
Configuration
The client can be configured when creating an instance:
import { FirecrawlClient } from 'firecrawl-simple-client';
// Default configuration
const DEFAULT_CONFIG = {
apiUrl: 'http://localhost:3002/v1',
};
// Create a client with custom configuration
const client = new FirecrawlClient({
apiUrl: 'https://api.firecrawl.com/v1',
apiKey: 'your-api-key',
});
// Get the current configuration
const config = client.getConfig();
console.log(config);
API Reference
FirecrawlClient
The main client class for interacting with the Firecrawl API.
Constructor
new FirecrawlClient(config?: Partial<FirecrawlConfig>)
Methods
getConfig(): Returns the current configuration
startCrawl(options): Start a new web crawling job
getCrawlStatus(jobId): Get the status of a crawl job
cancelCrawl(jobId): Cancel a running crawl job
scrapeWebpage(options): Scrape a single webpage (synchronous operation)
generateSitemap(options): Generate a sitemap for a website
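startCrawl and getCrawlStatus can be combined to wait for a crawl to finish. A minimal polling helper, sketched against an assumed { status } response shape (the real types come from the package; the status strings here are assumptions):

```typescript
// Assumed shape of the crawl status response; the actual type
// is defined by the package's generated client code.
interface CrawlStatus {
  status: string; // e.g. 'scraping' | 'completed' | 'failed'
}

// The one method this helper relies on, per the method list above.
interface CrawlClient {
  getCrawlStatus(jobId: string): Promise<CrawlStatus>;
}

// Poll a crawl job until it reaches a terminal state or the
// attempt budget is exhausted.
async function waitForCrawl(
  client: CrawlClient,
  jobId: string,
  opts: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<CrawlStatus> {
  const { intervalMs = 2000, maxAttempts = 30 } = opts;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await client.getCrawlStatus(jobId);
    if (status.status === 'completed' || status.status === 'failed') {
      return status;
    }
    // Wait before the next status check.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Crawl ${jobId} did not finish after ${maxAttempts} checks`);
}
```

With a real client this would be called as `await waitForCrawl(client, crawlResult.id)` after `startCrawl`.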
Deprecated Methods
getScrapeStatus(jobId): Deprecated, as scrape operations are now synchronous
checkLiveness(): No longer supported in the new API
checkReadiness(): No longer supported in the new API
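The extract option documented under Scrape Options below enables LLM-based structured extraction. A hedged sketch of assembling such a request; the URL, schema fields, and prompts here are illustrative assumptions, not part of the package:

```typescript
// Building scrapeWebpage options with the `extract` block from the
// option reference. The schema is a plain JSON-Schema-style object;
// all field names below are made up for illustration.
const scrapeOptions = {
  url: 'https://example.com/pricing',
  formats: ['markdown', 'extract'],
  waitFor: 500, // let client-side JS settle before capture
  timeout: 30_000,
  extract: {
    schema: {
      type: 'object',
      properties: {
        planName: { type: 'string' },
        monthlyPrice: { type: 'number' },
      },
    },
    systemPrompt: 'You extract pricing data from web pages.',
    prompt: 'Return the cheapest plan and its monthly price.',
  },
};

// With a configured client this would be passed straight through:
// const result = await client.scrapeWebpage(scrapeOptions);
```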
Scrape Options
{
url: string; // The URL to scrape (required)
formats?: Array<string>; // Formats to include: 'markdown', 'html', 'rawHtml', 'links', 'screenshot', 'extract', 'screenshot@fullPage'
includeTags?: Array<string>; // HTML tags to include in the result
excludeTags?: Array<string>; // HTML tags to exclude from the result
headers?: object; // Custom headers for the request
waitFor?: number; // Time in ms to wait for JavaScript execution
timeout?: number; // Request timeout in milliseconds
extract?: { // LLM extraction configuration
schema?: object; // Schema for structured data extraction
systemPrompt?: string; // System prompt for extraction
prompt?: string; // User prompt for extraction
}
}
Crawl Options
{
url: string; // The URL to start crawling from (required)
maxDepth?: number; // Maximum depth to crawl
limit?: number; // Maximum number of pages to crawl (formerly maxPages)
includePaths?: Array<string>; // URL patterns to include (formerly includeUrls)
excludePaths?: Array<string>; // URL patterns to exclude (formerly excludeUrls)
ignoreSitemap?: boolean; // Whether to ignore the website's sitemap
allowBackwardLinks?: boolean; // Allow navigation to previously linked pages
allowExternalLinks?: boolean; // Allow following links to external websites
scrapeOptions?: object; // Options for scraping each page
}
Error Handling
The API may return the following error codes:
402: Payment Required - You've exceeded your usage limits
429: Too Many Requests - Rate limit exceeded
404: Not Found - Resource not found
500: Server Error - Internal server error
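Of these codes, only 429 and 500 are worth retrying; 402 and 404 indicate conditions a retry cannot fix. A small helper encoding that decision; the error shape with a numeric status field is an assumption, so adjust it to whatever your version of the library actually throws:

```typescript
// Assumed error shape: an object carrying the HTTP status code.
interface ApiError {
  status?: number;
  message?: string;
}

// Decide whether a failed request should be retried, based on the
// status codes documented above.
function shouldRetry(err: ApiError): boolean {
  switch (err.status) {
    case 429: // rate limited: back off, then retry
    case 500: // transient server error: retry
      return true;
    default:  // 402 (usage limits) and 404 (not found) won't improve on retry
      return false;
  }
}
```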
License
MIT
