@spider-cloud/spider-client v0.2.0
Isomorphic JavaScript SDK for Spider Cloud services
Spider Cloud JavaScript SDK
The Spider Cloud JavaScript SDK offers a streamlined set of tools for web scraping and crawling, with capabilities that allow for comprehensive data extraction suitable for interfacing with AI language models. This SDK makes it easy to interact programmatically with the Spider Cloud API from any JavaScript or Node.js application.
Installation
You can install the Spider Cloud JavaScript SDK via npm:
npm install @spider-cloud/spider-client
Or with yarn:
yarn add @spider-cloud/spider-client
Configuration
Before using the SDK, you will need to provide it with your API key. Obtain an API key from spider.cloud, then either pass it directly to the constructor or set it as the SPIDER_API_KEY environment variable.
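The constructor option takes precedence over the environment variable. The lookup order can be sketched as follows (resolveApiKey is an illustrative helper for this README, not part of the SDK, which performs the equivalent check internally):

```javascript
// Illustrative helper (not SDK code): an explicit apiKey option wins,
// otherwise the key is read from the SPIDER_API_KEY environment variable.
function resolveApiKey(options = {}) {
  const key = options.apiKey ?? process.env.SPIDER_API_KEY;
  if (!key) {
    throw new Error("No API key: pass { apiKey } or set SPIDER_API_KEY");
  }
  return key;
}

process.env.SPIDER_API_KEY = "example-key";
console.log(resolveApiKey());                       // "example-key"
console.log(resolveApiKey({ apiKey: "explicit" })); // "explicit"
```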
Usage
Here's a basic example to demonstrate how to use the SDK:
import { Spider } from "@spider-cloud/spider-client";

// Initialize the SDK with your API key
const app = new Spider({ apiKey: "YOUR_API_KEY" });

// Scrape a URL
const url = "https://spider.cloud";
app
  .scrapeUrl(url)
  .then((data) => {
    console.log("Scraped Data:", data);
  })
  .catch((error) => {
    console.error("Scrape Error:", error);
  });

// Crawl a website
const crawlParams = {
  limit: 5,
  proxy_enabled: true,
  metadata: false,
  request: "http",
};
app
  .crawlUrl(url, crawlParams)
  .then((result) => {
    console.log("Crawl Result:", result);
  })
  .catch((error) => {
    console.error("Crawl Error:", error);
  });

A real-world crawl example, streaming the response:
import { Spider } from "@spider-cloud/spider-client";

// Initialize the SDK with your API key
const app = new Spider({ apiKey: "YOUR_API_KEY" });

// The target URL
const url = "https://spider.cloud";

// Crawl a website
const crawlParams = {
  limit: 5,
  metadata: true,
  request: "http",
};
const stream = true;
const streamCallback = (data) => {
  console.log(data["url"]);
};
app.crawlUrl(url, crawlParams, stream, streamCallback);

Available Methods
scrapeUrl(url, params): Scrape data from a specified URL. Optional parameters can be passed to customize the scraping behavior.
crawlUrl(url, params, stream, callback): Begin crawling from a specific URL, with optional parameters for customization and an optional streaming response.
search(q, params): Perform a search and gather a list of websites to start crawling and collect resources.
links(url, params): Retrieve all links from the specified URL, with optional parameters.
screenshot(url, params): Take a screenshot of the specified URL.
transform(data, params): Perform a fast HTML transformation to markdown or text.
unblocker(url, params): Unblock challenging websites with anti-bot bypass. Supports AI extraction with custom_prompt.
getCredits(): Retrieve the account's remaining credits.
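When stream is true, crawlUrl delivers each crawled page to the callback as it arrives instead of resolving with one aggregated result. A runnable sketch of that callback contract, using a mock in place of the real network call (mockCrawl and its page objects are illustrative, not SDK code):

```javascript
// Illustrative mock (not SDK code): mimics how a streaming crawl hands
// each page object to the callback instead of returning them in bulk.
function mockCrawl(url, params, stream, onPage) {
  const pages = [
    { url: `${url}/`, status: 200 },
    { url: `${url}/about`, status: 200 },
    { url: `${url}/pricing`, status: 200 },
  ].slice(0, params.limit);
  if (stream && onPage) {
    pages.forEach(onPage); // one callback invocation per page
    return Promise.resolve();
  }
  return Promise.resolve(pages); // non-streaming: one aggregated array
}

const seen = [];
mockCrawl("https://spider.cloud", { limit: 2 }, true, (page) => seen.push(page.url));
console.log(seen); // ["https://spider.cloud/", "https://spider.cloud/about"]
```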
AI Studio Methods
AI Studio methods require an active AI Studio subscription.
aiCrawl(url, prompt, params): AI-guided crawling using natural language prompts.
aiScrape(url, prompt, params): AI-guided scraping using natural language prompts.
aiSearch(prompt, params): AI-enhanced web search using natural language queries.
aiBrowser(url, prompt, params): AI-guided browser automation using natural language commands.
aiLinks(url, prompt, params): AI-guided link extraction and filtering.
// AI Scrape example
const result = await app.aiScrape(
  "https://example.com/products",
  "Extract all product names, prices, and descriptions"
);

Unblocker with AI Extraction
// Unblock and extract data using AI
const result = await app.unblocker("https://protected-site.com/products", {
  custom_prompt: "Extract all product names and prices as JSON"
});
// Extracted data is available in result[0].metadata.extracted_data

Unblocker with JSON Schema Extraction
Use JSON Schema for structured, validated extraction output:
const result = await app.unblocker("https://protected-site.com/products", {
  extraction_schema: {
    name: "products",
    description: "Product listing extraction",
    schema: JSON.stringify({
      type: "object",
      properties: {
        products: {
          type: "array",
          items: {
            type: "object",
            properties: {
              name: { type: "string" },
              price: { type: "number" }
            },
            required: ["name", "price"]
          }
        }
      }
    }),
    strict: true
  }
});
// Extracted data conforms to the schema in result[0].metadata.extracted_data

Error Handling
The SDK performs robust error handling and throws exceptions when it encounters critical issues. Always attach .catch() to promises, or wrap await calls in try/catch, to handle these errors gracefully.
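The same handling with async/await can be sketched as follows, using a stand-in rejecting promise so the pattern is runnable offline (failingScrape is a placeholder for a real call such as app.scrapeUrl(url)):

```javascript
// Stand-in for a rejecting SDK call (placeholder, not SDK code).
const failingScrape = () => Promise.reject(new Error("request failed"));

async function safeScrape() {
  try {
    const data = await failingScrape();
    console.log("Scraped Data:", data);
    return data;
  } catch (error) {
    // Runs whenever the awaited promise rejects.
    console.error("Scrape Error:", error.message);
    return null;
  }
}

safeScrape(); // logs "Scrape Error: request failed"
```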
Contributing
Contributions are always welcome! Feel free to open an issue or submit a pull request on our GitHub repository.
License
The Spider Cloud JavaScript SDK is open-source and released under the MIT License.
