@crawlkit-sh/sdk
v1.1.0
Official TypeScript/JavaScript SDK for CrawlKit - the modern web scraping API. Scrape websites, extract structured data with AI, capture screenshots, and collect social media data.
CrawlKit SDK
Turn any website into structured data with a single API call. CrawlKit handles proxies, JavaScript rendering, anti-bot detection, and data extraction so you can focus on building.
Features
- Web Scraping - Convert any webpage to clean Markdown, HTML, or raw content
- AI Data Extraction - Extract structured data with an LLM, guided by a JSON Schema
- Web Search - Search the web programmatically
- Screenshots - Capture full-page screenshots
- LinkedIn Scraping - Scrape company profiles and person profiles
- Instagram Scraping - Scrape profiles and posts/reels
- TikTok Scraping - Scrape profiles, posts, and paginated post lists
- App Store Data - Fetch reviews and details from Google Play & Apple App Store
- Browser Automation - Click, type, scroll, and execute JavaScript
- TypeScript First - Full type safety with comprehensive type definitions
- Zero Dependencies - Uses native fetch, works in Node.js 18+ and browsers
Installation
npm install @crawlkit-sh/sdk
yarn add @crawlkit-sh/sdk
pnpm add @crawlkit-sh/sdk
Quick Start
import { CrawlKit } from '@crawlkit-sh/sdk';
const crawlkit = new CrawlKit({ apiKey: 'ck_your_api_key' });
// Scrape a webpage
const page = await crawlkit.scrape({ url: 'https://example.com' });
console.log(page.markdown);
console.log(page.metadata.title);
Get your API key at crawlkit.sh
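The Quick Start hard-codes the key for brevity. In a deployed application you would more likely read it from the environment; the following is a minimal sketch of that pattern. The `resolveApiKey` helper and the `CRAWLKIT_API_KEY` variable name are assumptions made for illustration and are not part of the SDK.

```typescript
// Hypothetical helper (not part of @crawlkit-sh/sdk): reads the API key from
// an environment-like object and fails fast when it is missing or blank.
// The variable name CRAWLKIT_API_KEY is an assumption; use any name you like.
type Env = Record<string, string | undefined>;

function resolveApiKey(env: Env, name: string = "CRAWLKIT_API_KEY"): string {
  const key = env[name]?.trim();
  if (!key) {
    throw new Error(`Missing ${name}; create a key at crawlkit.sh and export it first`);
  }
  return key;
}

// Usage with the client from the Quick Start (Node.js):
//   const crawlkit = new CrawlKit({ apiKey: resolveApiKey(process.env) });
```

Taking the environment as a parameter, rather than reading `process.env` inside the helper, keeps the function usable in browsers and easy to exercise in tests.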
Examples
Web Scraping
Scrape any webpage and get clean, structured content:
const result = await crawlkit.scrape({
  url: 'https://example.com/blog/article',
  options: {
    onlyMainContent: true, // Remove navigation, footers, etc.
    waitFor: '#content', // Wait for element before scraping
  }
});
console.log(result.markdown); // Clean markdown content
console.log(result.html); // Cleaned HTML
console.log(result.metadata.title); // Page title
console.log(result.metadata.author); // Author if available
console.log(result.links.internal); // Internal links found
console.log(result.links.external); // External links found
AI-Powered Data Extraction
Extract structured data from any page using JSON Schema:
interface Product {
  name: string;
  price: number;
  currency: string;
  description: string;
  inStock: boolean;
  reviews: { rating: number; count: number };
}
const result = await crawlkit.extract<Product>({
  url: 'https://example.com/product/123',
  schema: {
    type: 'object',
    properties: {
      name: { type: 'string' },
      price: { type: 'number' },
      currency: { type: 'string' },
      description: { type: 'string' },
      inStock: { type: 'boolean' },
      reviews: {
        type: 'object',
        properties: {
          rating: { type: 'number' },
          count: { type: 'number' }
        }
      }
    }
  },
  options: {
    prompt: 'Extract product information from this e-commerce page'
  }
});
// TypeScript knows result.json is Product
console.log(`${result.json.name}: $${result.json.price}`);
console.log(`In stock: ${result.json.inStock}`);
Browser Automation
Handle SPAs, dynamic content, and interactive pages:
const result = await crawlkit.scrape({
  url: 'https://example.com/spa',
  options: {
    waitFor: '.content-loaded',
    actions: [
      { type: 'click', selector: '#accept-cookies' },
      { type: 'wait', milliseconds: 1000 },
      { type: 'click', selector: '#load-more' },
      { type: 'scroll', direction: 'down' },
      { type: 'type', selector: '#search', text: 'query' },
      { type: 'press', key: 'Enter' },
      { type: 'wait', milliseconds: 2000 },
    ]
  }
});
Web Search
Search the web and get structured results:
const result = await crawlkit.search({
  query: 'typescript best practices 2024',
  options: {
    maxResults: 20,
    timeRange: 'm', // 'm' = past month (options: 'd', 'w', 'm', 'y')
    region: 'us-en'
  }
});
for (const item of result.results) {
  console.log(`${item.position}. ${item.title}`);
  console.log(` ${item.url}`);
  console.log(` ${item.snippet}\n`);
}
Screenshots
Capture full-page screenshots:
const result = await crawlkit.screenshot({
  url: 'https://example.com',
  options: {
    width: 1920,
    height: 1080,
    waitForSelector: '#main-content'
  }
});
console.log('Screenshot URL:', result.url);
LinkedIn Scraping
Scrape LinkedIn company and person profiles:
// Company profile
const company = await crawlkit.linkedin.company({
  url: 'https://www.linkedin.com/company/openai',
  options: { includeJobs: true }
});
console.log(company.company.name);
console.log(company.company.industry);
console.log(company.company.followers);
console.log(company.company.description);
console.log(company.company.employees);
console.log(company.company.jobs);
// Person profiles (batch up to 10)
const people = await crawlkit.linkedin.person({
  url: [
    'https://www.linkedin.com/in/user1',
    'https://www.linkedin.com/in/user2'
  ]
});
console.log(`Success: ${people.successCount}, Failed: ${people.failedCount}`);
people.persons.forEach(p => console.log(p.person));
Instagram Scraping
Scrape Instagram profiles and content:
// Profile
const profile = await crawlkit.instagram.profile({
  username: 'instagram'
});
console.log(profile.profile.full_name);
console.log(profile.profile.follower_count);
console.log(profile.profile.following_count);
console.log(profile.profile.biography);
console.log(profile.profile.posts); // Recent posts
// Post/Reel content
const post = await crawlkit.instagram.content({
  shortcode: 'CxIIgCCq8mg' // or full URL
});
console.log(post.post.like_count);
console.log(post.post.comment_count);
console.log(post.post.video_url);
console.log(post.post.caption);
App Store Data
Fetch app reviews and details:
// Google Play Store reviews with pagination
let cursor: string | null = null;
do {
  const reviews = await crawlkit.appstore.playstoreReviews({
    appId: 'com.example.app',
    cursor,
    options: { lang: 'en' }
  });
  reviews.reviews.forEach(r => {
    console.log(`${r.rating}/5: ${r.text}`);
    if (r.developerReply) {
      console.log(` Reply: ${r.developerReply.text}`);
    }
  });
  cursor = reviews.pagination.nextCursor;
} while (cursor);
// Google Play Store app details
const details = await crawlkit.appstore.playstoreDetail({
  appId: 'com.example.app'
});
console.log(details.appName);
console.log(details.rating);
console.log(details.installs);
console.log(details.description);
// Apple App Store reviews
const iosReviews = await crawlkit.appstore.appstoreReviews({
  appId: '123456789'
});
// Apple App Store app details
const iosDetails = await crawlkit.appstore.appstoreDetail({
  appId: '1492793493'
});
console.log(iosDetails.appName);
TikTok Scraping
Scrape TikTok profiles and content:
// Profile
const profile = await crawlkit.tiktok.profile({
  username: 'nike'
});
console.log(profile.profile.username);
console.log(profile.profile.stats?.followers);
// Single post
const post = await crawlkit.tiktok.content({
  url: 'https://www.tiktok.com/@nike/video/1234567890'
});
console.log(post.post.id);
console.log(post.post.mediaType);
// Paginated post list
const posts = await crawlkit.tiktok.posts({
  username: 'nike'
});
console.log(posts.posts.length);
console.log(posts.pagination.hasMore);
Error Handling
The SDK provides typed error classes for different scenarios:
import {
  CrawlKit,
  CrawlKitError,
  AuthenticationError,
  InsufficientCreditsError,
  ValidationError,
  RateLimitError,
  TimeoutError,
  NotFoundError,
  NetworkError
} from '@crawlkit-sh/sdk';
try {
  const result = await crawlkit.scrape({ url: 'https://example.com' });
} catch (error) {
  if (error instanceof AuthenticationError) {
    console.log('Invalid API key');
  } else if (error instanceof InsufficientCreditsError) {
    console.log(`Not enough credits. Available: ${error.creditsRemaining}`);
  } else if (error instanceof RateLimitError) {
    console.log('Rate limit exceeded, please slow down');
  } else if (error instanceof TimeoutError) {
    console.log('Request timed out');
  } else if (error instanceof ValidationError) {
    console.log(`Invalid request: ${error.message}`);
  } else if (error instanceof NetworkError) {
    console.log(`Network error [${error.code}]: ${error.message}`);
  } else if (error instanceof CrawlKitError) {
    console.log(`API Error [${error.code}]: ${error.message}`);
    console.log(`Status: ${error.statusCode}`);
    if (error.creditsRefunded) {
      console.log(`Credits refunded: ${error.creditsRefunded}`);
    }
  }
}
Configuration
const crawlkit = new CrawlKit({
  // Required: Your API key (get it at crawlkit.sh)
  apiKey: 'ck_your_api_key',
  // Optional: Custom base URL (default: https://api.crawlkit.sh)
  baseUrl: 'https://api.crawlkit.sh',
  // Optional: Default timeout in ms (default: 30000)
  timeout: 60000,
  // Optional: Custom fetch implementation
  // (customFetch is a placeholder for your own fetch-compatible function)
  fetch: customFetch
});
Credit Costs
| Operation | Credits |
|-----------|---------|
| scrape() | 1 |
| extract() | 5 |
| search() | 1 per page (~10 results) |
| screenshot() | 1 |
| linkedin.company() | 1 |
| linkedin.person() | 3 per URL |
| instagram.profile() | 1 |
| instagram.content() | 1 |
| appstore.playstoreReviews() | 1 per page |
| appstore.playstoreDetail() | 1 |
| appstore.appstoreDetail() | 1 |
| appstore.appstoreReviews() | 1 per page |
| tiktok.profile() | 1 |
| tiktok.content() | 1 |
| tiktok.posts() | 1 per page |
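To budget a batch job before running it, the table above can be turned into a small lookup. The following is a rough sketch: `CREDIT_COSTS`, `PlannedCall`, and `estimateCredits` are illustrative names that do not exist in the SDK, and `units` stands for whatever the table bills by (calls, pages, or URLs). Actual billing is determined by the API, so treat the result as an estimate.

```typescript
// Per-unit credit costs copied from the table above.
// These names are illustrative only; they are not SDK exports.
const CREDIT_COSTS = {
  "scrape": 1,
  "extract": 5,
  "search": 1, // per page (~10 results)
  "screenshot": 1,
  "linkedin.company": 1,
  "linkedin.person": 3, // per URL
  "instagram.profile": 1,
  "instagram.content": 1,
  "appstore.playstoreReviews": 1, // per page
  "appstore.playstoreDetail": 1,
  "appstore.appstoreDetail": 1,
  "appstore.appstoreReviews": 1, // per page
  "tiktok.profile": 1,
  "tiktok.content": 1,
  "tiktok.posts": 1, // per page
} as const;

type Operation = keyof typeof CREDIT_COSTS;

interface PlannedCall {
  operation: Operation;
  units?: number; // calls, pages, or URLs; defaults to 1
}

// Sums cost * units over every planned call.
function estimateCredits(plan: PlannedCall[]): number {
  return plan.reduce(
    (total, { operation, units = 1 }) => total + CREDIT_COSTS[operation] * units,
    0,
  );
}

const plan: PlannedCall[] = [
  { operation: "scrape", units: 10 },                   // 10 pages  -> 10
  { operation: "extract", units: 2 },                   // 2 calls   -> 10
  { operation: "linkedin.person", units: 4 },           // 4 URLs    -> 12
  { operation: "appstore.playstoreReviews", units: 3 }, // 3 pages   -> 3
];
console.log(estimateCredits(plan)); // 35
```

Typing `operation` as a key of the lookup means a misspelled operation name is caught at compile time instead of silently contributing zero credits.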
TypeScript Support
This SDK is written in TypeScript and provides comprehensive type definitions for all methods and responses. Enable strict mode in your tsconfig.json for the best experience:
{
  "compilerOptions": {
    "strict": true
  }
}
Requirements
- Node.js 18.0.0 or higher (for native fetch support)
- Or any modern browser with fetch support
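If you prefer to fail fast in an unsupported runtime, both requirements can be checked before constructing the client. This is only a sketch: `meetsNodeRequirement` and `hasNativeFetch` are hypothetical helpers, not SDK exports, and the SDK may perform its own checks.

```typescript
// Hypothetical runtime checks (not part of the SDK).
const MIN_NODE_MAJOR = 18;

// Returns true when a Node.js version string such as "v18.17.0" or
// "20.11.1" has a major version of 18 or higher.
function meetsNodeRequirement(version: string): boolean {
  const match = /^v?(\d+)\./.exec(version);
  return match !== null && Number(match[1]) >= MIN_NODE_MAJOR;
}

// Returns true when the given scope exposes a fetch function.
// Defaults to the current global scope, which covers Node.js and browsers.
function hasNativeFetch(
  scope: { fetch?: unknown } = globalThis as unknown as { fetch?: unknown },
): boolean {
  return typeof scope.fetch === "function";
}

console.log(meetsNodeRequirement("v18.0.0")); // true
console.log(meetsNodeRequirement("v16.20.2")); // false
```

In Node.js you could pass `process.version` to `meetsNodeRequirement`; in a browser only the `hasNativeFetch()` check applies.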
Documentation
For detailed API documentation and guides, visit docs.crawlkit.sh
License
MIT License - see LICENSE for details.
Built with love by CrawlKit
