
markudown-sdk

v0.1.0

Node.js / TypeScript SDK for MarkUDown — AI data extraction infrastructure

MarkUDown Node.js / TypeScript SDK

TypeScript SDK for MarkUDown — AI data extraction infrastructure that turns any website into structured, ready-to-use data.

Requires Node 18+ (uses native fetch). Zero dependencies.

Installation

npm install markudown
# or
pnpm add markudown
# or
yarn add markudown

Quick Start

import { MarkUDown } from "markudown";

const md = new MarkUDown({ apiKey: "your-api-key" });

// Scrape a page
const result = await md.scrape("https://example.com");

// Search the web
const results = await md.search("best web scraping tools 2025", {
  engine: "google",
  limit: 5,
});

// Extract structured data
const data = await md.extract(
  "https://shop.example.com/products",
  [
    { name: "title",  type: "string", active: true },
    { name: "price",  type: "float",  active: true },
    { name: "rating", type: "float",  active: true },
  ],
  { extractQuery: "Extract all product listings" }
);

Get your API key at scrapetechnology.com/markudown. New accounts include 500 free credits.


Configuration

const md = new MarkUDown({
  apiKey: "your-api-key",
  baseUrl: "https://api.scrapetechnology.com", // default
  timeout: 120_000,      // HTTP timeout in ms (default: 120 000)
  pollInterval: 2_000,   // ms between status polls (default: 2 000)
  pollTimeout: 300_000,  // max ms to wait for a job (default: 300 000)
});

Every job-based (async) endpoint accepts wait: false, which returns the job ID immediately instead of waiting for the job to finish:

const job = await md.crawl("https://example.com", { wait: false }) as { id: string };

// Check later
const status = await md.getJobStatus("crawl", job.id);
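
If you opt out of the built-in waiting, you need your own polling loop. The following is a minimal sketch, not the SDK's implementation: `pollUntilDone`, `JobState`, and `JobStatus` are hypothetical names, the status values are the ones listed under Job Management, and the default interval and timeout simply mirror the documented client defaults.

```typescript
// Status values as listed under "Job Management" in this README.
type JobState = "pending" | "processing" | "completed" | "failed";

// Assumed minimal shape of a status response; the real payload may carry more.
interface JobStatus {
  status: JobState;
}

// Hypothetical helper: `getStatus` is any function that fetches the current
// status, e.g. () => md.getJobStatus("crawl", job.id) cast to JobStatus.
async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  intervalMs = 2_000,
  timeoutMs = 300_000,
): Promise<JobStatus> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const current = await getStatus();
    // "completed" and "failed" are terminal; stop polling on either.
    if (current.status === "completed" || current.status === "failed") {
      return current;
    }
    // Give up if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Job still ${current.status} after ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Passing the status fetcher as a function keeps the loop independent of the client, so it can be exercised with a fake in tests.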

API Reference

scrape(url, opts?) → Promise<unknown>

Scrape a single URL. Returns the result immediately.

const result = await md.scrape("https://example.com", {
  mainContent: true,
  includeLinks: true,
  excludeTags: ["header", "nav", "footer"],
});

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| timeout | number | 60 | Timeout in seconds |
| excludeTags | string[] | ["header","nav","footer"] | HTML tags to strip |
| mainContent | boolean | true | Extract only main content |
| includeLinks | boolean | false | Include hyperlinks |
| includeHtml | boolean | false | Include raw HTML |
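
To see which options a call ends up with, the documented defaults can be mirrored locally. This is only a sketch: `ScrapeOptions`, `SCRAPE_DEFAULTS`, and `effectiveScrapeOptions` are hypothetical names, and the SDK's own exported types (if any) may differ. Note that this per-request timeout is documented in seconds, while the client-level timeout under Configuration is in milliseconds.

```typescript
// Hypothetical local mirror of the scrape options listed in this README.
interface ScrapeOptions {
  timeout?: number; // seconds, unlike the client-level timeout (milliseconds)
  excludeTags?: string[];
  mainContent?: boolean;
  includeLinks?: boolean;
  includeHtml?: boolean;
}

// Defaults as documented for scrape().
const SCRAPE_DEFAULTS: Required<ScrapeOptions> = {
  timeout: 60,
  excludeTags: ["header", "nav", "footer"],
  mainContent: true,
  includeLinks: false,
  includeHtml: false,
};

// Merge caller-supplied options over the documented defaults.
function effectiveScrapeOptions(opts: ScrapeOptions = {}): Required<ScrapeOptions> {
  return { ...SCRAPE_DEFAULTS, ...opts };
}
```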


screenshot(url, opts?) → Promise<unknown>

Take a screenshot of a URL.

const result = await md.screenshot("https://example.com");
// result.image → base64 PNG
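
Because the method is typed as returning unknown, the image field has to be narrowed before it can be decoded. A minimal sketch, assuming the result is an object whose image property holds the base64 string; `decodeScreenshot` and `looksLikePng` are hypothetical helpers, and the optional data-URL prefix handling is a guess rather than documented behavior.

```typescript
// Narrow the untyped screenshot result and decode its base64 payload to bytes.
function decodeScreenshot(result: unknown): Uint8Array {
  if (typeof result !== "object" || result === null) {
    throw new TypeError("screenshot result is not an object");
  }
  const image = (result as { image?: unknown }).image;
  if (typeof image !== "string") {
    throw new TypeError("screenshot result has no string `image` field");
  }
  // Assumption: tolerate a data-URL prefix in case the API ever includes one.
  const base64 = image.replace(/^data:image\/png;base64,/, "");
  const binary = atob(base64); // global in Node 16+ and in browsers
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// The eight-byte signature every PNG file starts with.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Cheap sanity check before writing the bytes to disk.
function looksLikePng(bytes: Uint8Array): boolean {
  return PNG_SIGNATURE.every((value, index) => bytes[index] === value);
}
```

The returned bytes can be passed directly to writeFile from node:fs/promises to save the screenshot.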

map(url, opts?) → Promise<unknown>

Discover all URLs on a website.

const result = await md.map("https://example.com", {
  maxUrls: 500,
  allowedWords: ["product", "shop"],
  blockedWords: ["blog", "careers"],
});

crawl(url, opts?) → Promise<unknown>

Crawl a website recursively and extract content from every page.

const result = await md.crawl("https://docs.example.com", {
  maxDepth: 3,
  limit: 50,
  includeOnly: ["*guide*", "*tutorial*"],
});

| Option | Type | Default |
|--------|------|---------|
| maxDepth | number | 2 |
| limit | number | 10 |
| timeout | number | 60 |
| includeLinks | boolean | false |
| blockedWords | string[] | [] |
| includeOnly | string[] | [] |
| mainContent | boolean | true |
| callbackUrl | string | — |
| wait | boolean | true |


extract(url, schema, opts?) → Promise<unknown>

Schema-based structured data extraction with AI.

const result = await md.extract(
  "https://shop.example.com",
  [
    { name: "title",    type: "string", active: true },
    { name: "price",    type: "float",  active: true },
    { name: "in_stock", type: "string", active: true },
  ],
  {
    extractQuery: "Product listings with title, price and availability",
    extractionScope: "whole_site",
  }
);
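
Writing the schema array by hand gets repetitive for larger schemas. The sketch below builds it from a plain object; `SchemaField` and `buildSchema` are hypothetical names, and the field shape is copied from the examples in this README. Only the "string" and "float" types appear in those examples, so the type is left as an open string here.

```typescript
// Field shape as used in the extract() examples.
interface SchemaField {
  name: string;
  type: string; // "string" and "float" appear in the examples; others are unverified
  active: boolean;
}

// Hypothetical helper: { title: "string", price: "float" } -> schema array,
// with every field marked active.
function buildSchema(fields: Record<string, string>): SchemaField[] {
  return Object.keys(fields).map((name) => ({
    name,
    type: fields[name],
    active: true,
  }));
}

const productSchema = buildSchema({
  title: "string",
  price: "float",
  in_stock: "string",
});
```

The resulting productSchema has the same shape as the array passed as the second argument in the example above.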

createSchema(query) → Promise<unknown>

Generate an extraction schema automatically from a natural language description.

const schema = await md.createSchema(
  "Extract all products from https://shop.example.com with title, price and rating"
);

promptExtract(url, prompt, opts?) → Promise<unknown>

Extract data using a natural language prompt — no schema required.

const result = await md.promptExtract(
  "https://example.com/team",
  "List all team members with their name, title and LinkedIn URL"
);

batchScrape(urls, opts?) → Promise<unknown>

Scrape multiple URLs in parallel.

const result = await md.batchScrape([
  "https://example.com/page-1",
  "https://example.com/page-2",
  "https://example.com/page-3",
]);
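
For long URL lists it can help to submit several smaller batches instead of one large call. This README does not state a maximum batch size, so the size used below is purely an assumption, and `chunk` is a hypothetical helper rather than part of the SDK.

```typescript
// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size < 1 || Math.floor(size) !== size) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Example: five URLs in groups of two (the group size is an arbitrary choice).
const urlGroups = chunk(
  [
    "https://example.com/page-1",
    "https://example.com/page-2",
    "https://example.com/page-3",
    "https://example.com/page-4",
    "https://example.com/page-5",
  ],
  2,
);
```

Each group could then be passed to batchScrape in turn.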

search(query, opts?) → Promise<unknown>

Web search with optional scraping of result pages.

const result = await md.search("open source web scraping", {
  engine: "all",       // "google" | "bing" | "duckduckgo" | "all"
  limit: 10,
  scrapeResults: true,
  lang: "en",
  country: "us",
});

rss(url, opts?) → Promise<unknown>

Generate an RSS feed from any web page.

const result = await md.rss("https://blog.example.com", {
  maxItems: 20,
  title: "My Blog Feed",
});

changeDetection(url, opts?) → Promise<unknown>

Detect and diff content changes on a page.

const result = await md.changeDetection("https://example.com/pricing", {
  includeDiff: true,
});
// result.data.changed → true/false
// result.data.diff    → diff string
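
Since every method resolves to unknown, reading result.data.changed needs a narrowing step first. A minimal sketch of a type guard, assuming the shape implied by the two comments above; `ChangeResult` and `isChangeResult` are hypothetical names, and the real response may contain additional fields.

```typescript
// Shape implied by the comments in the changeDetection example.
interface ChangeResult {
  data: {
    changed: boolean;
    diff?: string;
  };
}

// Type guard: returns true only if `value` has the fields we intend to read.
function isChangeResult(value: unknown): value is ChangeResult {
  if (typeof value !== "object" || value === null) return false;
  const data = (value as { data?: unknown }).data;
  if (typeof data !== "object" || data === null) return false;
  const fields = data as { changed?: unknown; diff?: unknown };
  return (
    typeof fields.changed === "boolean" &&
    (fields.diff === undefined || typeof fields.diff === "string")
  );
}

// Summarize a result without resorting to `as` casts.
function describeChange(result: unknown): string {
  if (!isChangeResult(result)) return "unexpected response shape";
  return result.data.changed ? "page changed" : "no change";
}
```

The same pattern applies to any other method whose result fields you want to read safely.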

deepResearch(query, urls, opts?) → Promise<unknown>

Scrape multiple pages and synthesize findings with an LLM.

const result = await md.deepResearch(
  "What are the pricing differences between these competitors?",
  [
    "https://competitor-a.com/pricing",
    "https://competitor-b.com/pricing",
  ],
  { maxTokens: 4096 }
);

agent(url, prompt, opts?) → Promise<unknown>

Autonomous AI navigation agent that clicks, scrolls, and fills in forms.

const result = await md.agent(
  "https://example.com",
  "Find the contact email on this website",
  { maxSteps: 15, includeScreenshots: false }
);

smartExtract(url, goal, opts?) → Promise<unknown>

Guided extraction: analyzes the site, plans interactions, and returns complete data.

const result = await md.smartExtract(
  "https://example.com/products",
  "Extract all products with name, price, and description",
  {
    outputFormat: "json",
    hints: ["Click 'Load more' button to get all items"],
  }
);

rank(keyword, domain, opts?) → Promise<unknown>

Check where a domain ranks for a keyword in search results.

const result = await md.rank("web scraping api", "scrapetechnology.com", {
  engine: "google",
  country: "us",
  depth: 100,
});
// result.data.position → e.g. 4

dataset(url, goal, opts?) → Promise<unknown>

Auto-paginate a listing page and extract all items as a structured dataset.

const result = await md.dataset(
  "https://books.toscrape.com",
  "Extract all books with title, price, rating and availability",
  { maxPages: 10, outputFormat: "json" }
);

Monitor

// Create
const sub = await md.monitorCreate(
  "https://example.com/pricing",
  "https://yourapp.com/webhook",
  { intervalMinutes: 60 }
);
const subId = (sub as { subscription_id: string }).subscription_id;

// List
const monitors = await md.monitorList();

// Delete
await md.monitorDelete(subId);

instagram(resource, target, opts?) → Promise<unknown>

Extract public Instagram data.

| resource | target | Description |
|------------|----------|-------------|
| "profile" | username | Profile info + recent posts |
| "post" | shortcode | Single post |
| "hashtag" | hashtag | Recent posts for a hashtag |
| "search" | keyword | Search results |

// Profile
const profile = await md.instagram("profile", "instagram");

// Hashtag
const posts = await md.instagram("hashtag", "webdev", { limit: 30 });

// Search
const results = await md.instagram("search", "coffee shops NYC");

x(resource, target, opts?) → Promise<unknown>

Extract public X (Twitter) data.

| resource | target | Description |
|------------|----------|-------------|
| "profile" | username | Profile info + recent tweets |
| "post" | tweet ID | Single tweet |
| "search" | keyword | Search results |

const profile = await md.x("profile", "elonmusk");
const results = await md.x("search", "web scraping tools", { limit: 25 });

Job Management

// Get status
const status = await md.getJobStatus("crawl", "job-id");
// status.status → "pending" | "processing" | "completed" | "failed"

// Cancel
await md.cancelJob("agent", "job-id");

Error Handling

import { MarkUDown, MarkUDownError, RateLimitError, InsufficientCreditsError } from "markudown";

const md = new MarkUDown({ apiKey: "your-key" });

try {
  const result = await md.scrape("https://example.com");
} catch (e) {
  if (e instanceof RateLimitError) {
    console.log("Rate limit hit — wait a minute and retry");
  } else if (e instanceof InsufficientCreditsError) {
    console.log("Out of credits — upgrade at scrapetechnology.com");
  } else if (e instanceof MarkUDownError) {
    console.log(`API error ${e.statusCode}: ${e.message}`);
  } else {
    throw e; // not an SDK error: rethrow rather than swallow it
  }
}
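
Rate-limit errors are usually worth retrying automatically. The following is a sketch of a generic retry wrapper with exponential backoff, not something the SDK is known to provide: `withRetry` and its parameters are hypothetical, and the delays are illustrative. The log message above suggests waiting about a minute, so a real base delay may need to be much longer than the default shown here.

```typescript
// Retry `fn` while `shouldRetry` approves the error, doubling the delay each time.
async function withRetry<T>(
  fn: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean,
  maxAttempts = 4,
  baseDelayMs = 1_000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up on the last attempt or on errors that are not retryable.
      if (attempt >= maxAttempts || !shouldRetry(error)) throw error;
      const delayMs = baseDelayMs * 2 ** (attempt - 1); // 1x, 2x, 4x, ...
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

With the SDK this might be called as withRetry(() => md.scrape(url), (e) => e instanceof RateLimitError), so that credit and other API errors still fail immediately.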

Build

npm install
npm run build       # compiles to dist/
npm run typecheck   # type-check without emitting

Links