

@monsoft/scraper

Web and social media scraping clients for the ScrapingDog and ScrapeCreator APIs.

Features

  • ScrapingDog Client - Web scraping with JavaScript rendering support
  • ScrapeCreator Client - Social media profile and posts scraping
    • Instagram profiles and posts
    • TikTok profiles and posts
    • Facebook pages
  • Built-in retry logic with exponential backoff for rate limiting
  • TypeScript support with full type definitions
  • ESM module format

Installation

npm install @monsoft/scraper
# or
pnpm add @monsoft/scraper
# or
yarn add @monsoft/scraper

Configuration

API keys can be provided via the constructor or via environment variables:

| Client              | Constructor Option | Environment Variable    |
| ------------------- | ------------------ | ----------------------- |
| ScrapingDogClient   | apiKey             | SCRAPINGDOG_API_KEY     |
| ScrapeCreatorClient | apiKey             | SCRAPECREATOR_API_KEY   |
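
For the environment-variable route, the keys can be exported before the process starts (the values below are placeholders):

export SCRAPINGDOG_API_KEY="your-scrapingdog-api-key"
export SCRAPECREATOR_API_KEY="your-scrapecreator-api-key"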

Usage

ScrapingDog Client

import { ScrapingDogClient } from "@monsoft/scraper";

// Option 1: Provide API key via constructor
const client = new ScrapingDogClient({ apiKey: "your-api-key" });

// Option 2: Use the SCRAPINGDOG_API_KEY environment variable
// const client = new ScrapingDogClient();

// Scrape a webpage
const html = await client.scrape("https://example.com");

// Scrape as markdown (recommended for text extraction)
const markdown = await client.scrapeAsMarkdown("https://example.com");

// Enable dynamic rendering for JavaScript-heavy sites
const content = await client.scrapeAsMarkdown("https://example.com", true);

ScrapeCreator Client

import { ScrapeCreatorClient } from "@monsoft/scraper";

// Option 1: Provide API key via constructor
const client = new ScrapeCreatorClient({ apiKey: "your-api-key" });

// Option 2: Use the SCRAPECREATOR_API_KEY environment variable
// const client = new ScrapeCreatorClient();

// Scrape Instagram profile
const instagramProfile = await client.scrapeInstagramProfile("username");

// Scrape Instagram posts (with optional pagination cursor)
const instagramPosts = await client.scrapeInstagramPosts("username");
const nextInstagramPage = await client.scrapeInstagramPosts(
  "username",
  instagramPosts.next_max_id
);

// Scrape TikTok profile
const tiktokProfile = await client.scrapeTiktokProfile("username");

// Scrape TikTok posts (with optional pagination cursor)
const tiktokPosts = await client.scrapeTiktokPosts("username");
const nextTiktokPage = await client.scrapeTiktokPosts(
  "username",
  tiktokPosts.max_cursor
);

// Scrape Facebook page
const facebookPage = await client.scrapeFacebookProfile(
  "https://facebook.com/pagename"
);
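
The cursors returned by the posts endpoints can be fed back in to walk a feed page by page. Below is a hypothetical pagination loop; it assumes next_max_id is undefined once no further pages remain, which you should verify against the ScrapecreatorInstagramPostsResponse type:

import { ScrapeCreatorClient } from "@monsoft/scraper";

// Hypothetical pagination sketch -- not part of the documented examples.
const client = new ScrapeCreatorClient(); // uses SCRAPECREATOR_API_KEY

const allPages = [];
let cursor: string | undefined;
do {
  // Pass the previous page's cursor (undefined on the first request)
  const page = await client.scrapeInstagramPosts("username", cursor);
  allPages.push(page);
  cursor = page.next_max_id;
} while (cursor);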

Social Media Scrapers (High-Level API)

import {
  scrapeInstagramProfile,
  scrapeTiktokProfile,
  scrapeFacebookPage,
  extractInstagramHandle,
} from "@monsoft/scraper";

// Extract handle from URL
const handle = extractInstagramHandle("https://instagram.com/username");

// Scrape profiles (returns standardized result objects)
const instagramResult = await scrapeInstagramProfile(
  "https://instagram.com/username"
);
const tiktokResult = await scrapeTiktokProfile("https://tiktok.com/@username");
const facebookResult = await scrapeFacebookPage(
  "https://facebook.com/pagename"
);

Website Scraper (requires @monsoft/ai)

The scrapeWebsite function uses AI to extract structured business information from websites:

import { scrapeWebsite } from "@monsoft/scraper";

const result = await scrapeWebsite("https://example.com");

if (result.success) {
  console.log(result.data); // Extracted business information
  console.log(result.rawContent); // Raw markdown content
}

Note: This feature requires @monsoft/ai as a peer dependency.
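
If it is not already present, the peer dependency can be installed alongside this package:

npm install @monsoft/ai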

API Reference

ScrapingDogClient

Constructor

new ScrapingDogClient(config?: ScrapingDogClientConfig)

| Option | Type   | Description                                           |
| ------ | ------ | ----------------------------------------------------- |
| apiKey | string | API key (falls back to SCRAPINGDOG_API_KEY env var)   |

Methods

| Method           | Parameters                                                        | Returns         | Description            |
| ---------------- | ----------------------------------------------------------------- | --------------- | ---------------------- |
| scrape           | url: string, options?: { dynamic?: boolean, markdown?: boolean } | Promise<string> | Scrape a URL           |
| scrapeAsMarkdown | url: string, dynamic?: boolean                                    | Promise<string> | Scrape URL as markdown |

ScrapeCreatorClient

Constructor

new ScrapeCreatorClient(config?: ScrapeCreatorClientConfig)

| Option | Type   | Description                                             |
| ------ | ------ | ------------------------------------------------------- |
| apiKey | string | API key (falls back to SCRAPECREATOR_API_KEY env var)   |

Methods

| Method                 | Parameters                      | Returns                                          | Description              |
| ---------------------- | ------------------------------- | ------------------------------------------------ | ------------------------ |
| scrapeInstagramProfile | handle: string                  | Promise<ScrapecreatorInstagramProfileResponse>   | Scrape Instagram profile |
| scrapeInstagramPosts   | handle: string, cursor?: string | Promise<ScrapecreatorInstagramPostsResponse>     | Scrape Instagram posts   |
| scrapeTiktokProfile    | handle: string                  | Promise<ScrapecreatorTiktokProfileResponse>      | Scrape TikTok profile    |
| scrapeTiktokPosts      | handle: string, cursor?: number | Promise<ScrapecreatorTiktokPostsResponse>        | Scrape TikTok posts      |
| scrapeFacebookProfile  | facebookUrl: string             | Promise<ScrapecreatorFacebookPageResponse>       | Scrape Facebook page     |

Rate Limiting

Both clients include built-in retry logic with exponential backoff for 429 (rate limit) errors:

  • Maximum 5 retries
  • Base delay: 1 second
  • Maximum delay: 30 seconds
  • Jitter applied to prevent thundering herd
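
The retry implementation is internal to the clients, but the documented parameters suggest a schedule along these lines (an illustrative sketch, not the library's actual code):

// Illustrative only: exponential backoff with full jitter, using the
// documented parameters. The real logic is internal to the clients.
const MAX_RETRIES = 5;
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 30_000;

function backoffDelayMs(attempt: number): number {
  // Double the base delay each attempt, capped at the maximum
  const exponential = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
  // Randomize within the window so concurrent clients spread out
  return Math.random() * exponential;
}

for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
  console.log(`retry ${attempt + 1} after ~${backoffDelayMs(attempt)}ms`);
}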

License

MIT