olyptik

v0.1.18

Official TypeScript SDK for Olyptik API

Olyptik - Node.js SDK

Get started with the Olyptik Node.js/TypeScript SDK for web crawling and content extraction

Installation

Install the SDK using npm:

npm install olyptik

Configuration

Initialize the SDK with your API key, which you can find on the settings page. You can pass the key directly or read it from an environment variable.

import Olyptik from 'olyptik';

// Initialize with API key
const client = new Olyptik({ apiKey: 'your-api-key' });
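The README mentions environment variables but does not show how the key is read from one. A minimal sketch of one way to do it follows. The helper `resolveApiKey` and the variable name `OLYPTIK_API_KEY` are assumptions made for illustration; neither is confirmed to be part of the SDK.

```typescript
// Hypothetical helper, not part of the olyptik SDK.
// OLYPTIK_API_KEY is an assumed variable name; substitute the one your deployment uses.
type Env = Record<string, string | undefined>;

function resolveApiKey(env: Env, explicitKey?: string): string {
  // An explicitly passed key wins; otherwise fall back to the environment.
  const key = (explicitKey ?? env["OLYPTIK_API_KEY"] ?? "").trim();
  if (key === "") {
    throw new Error("Missing API key: pass one explicitly or set OLYPTIK_API_KEY");
  }
  return key;
}

// Usage in Node.js:
//   const client = new Olyptik({ apiKey: resolveApiKey(process.env) });
```

Failing early with a clear message tends to be easier to debug than an authentication error on the first request.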

Usage

Starting a Crawl

The SDK allows you to start web crawls with various configuration options:

Minimal settings crawl:

const crawl = await client.runCrawl({
  startUrl: 'https://example.com',
  maxResults: 10
});

Full example:

const crawl = await client.runCrawl({
  startUrl: 'https://example.com',
  maxResults: 100,
  maxDepth: 10,
  includeLinks: true,
  useSitemap: false,
  entireWebsite: false, 
  excludeNonMainTags: true,
  deduplicateContent: true,
  extraction: "",
  timeout: 60,
  engineType: "auto",
  useStaticIps: false
});

Get crawl

Retrieve a crawl by its ID. The response is a Crawl object:

// `crawl` is the object returned by runCrawl above
const fetchedCrawl: Crawl = await client.getCrawl(crawl.id);
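Because a crawl reports a status of "RUNNING" until it finishes, a common pattern is to poll until the status changes. The sketch below is a hypothetical helper, not an SDK feature: `waitForCrawl`, its parameters, and the default interval are illustrative assumptions. It takes the fetch function as an argument, so any function that returns the latest crawl will work.

```typescript
// Minimal shape needed for polling; the real Crawl object has more fields.
interface CrawlLike {
  id: string;
  status: string;
}

// Hypothetical helper: re-fetches the crawl until its status is no longer "RUNNING".
// `fetchCrawl` is any function returning the latest crawl, e.g. (id) => client.getCrawl(id).
async function waitForCrawl<T extends CrawlLike>(
  crawlId: string,
  fetchCrawl: (id: string) => Promise<T>,
  intervalMs = 5000,
  maxAttempts = 120,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const crawl = await fetchCrawl(crawlId);
    if (crawl.status !== "RUNNING") {
      return crawl; // SUCCEEDED, FAILED, TIMED_OUT, ABORTED, or ERROR
    }
    // Wait before checking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Crawl ${crawlId} still running after ${maxAttempts} checks`);
}

// Usage:
//   const finished = await waitForCrawl(crawl.id, (id) => client.getCrawl(id));
```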

Query crawls

const result: PaginationResult<Crawl> = await client.queryCrawls({
    startUrls: ['https://example.com'],
    status: [CrawlStatus.SUCCEEDED],
    page: 0,
});

console.log("Crawls: ", result.results);
console.log("Page: ", result.page);
console.log("Total pages: ", result.totalPages);
console.log("Count of items per page: ", result.limit);
console.log("Total matched crawls: ", result.totalResults);

Getting Crawl Results

Retrieve the results of your crawl using the crawl ID. The results are paginated, and you can specify the page number and limit per page.

const limit = 50;
const page = 0;
const results: PaginationResult<CrawlResult> = await client.getCrawlResults(crawl.id, page, limit);
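To collect every result rather than a single page, you can loop until `totalPages` is reached. The following is a sketch under assumptions: `collectAllPages` is a hypothetical helper, the `Page` interface only mirrors the pagination fields printed in the "Query crawls" example, and page numbering is assumed to start at 0 as in the example above.

```typescript
// Local mirror of the pagination fields shown in the "Query crawls" example.
interface Page<T> {
  results: T[];
  page: number;
  totalPages: number;
  limit: number;
  totalResults: number;
}

// Hypothetical helper: requests pages in order until totalPages is reached.
// Assumes the first page is page 0.
async function collectAllPages<T>(
  fetchPage: (page: number, limit: number) => Promise<Page<T>>,
  limit = 50,
): Promise<T[]> {
  const items: T[] = [];
  let page = 0;
  let totalPages = 1; // placeholder until the first response arrives
  while (page < totalPages) {
    const response = await fetchPage(page, limit);
    items.push(...response.results);
    totalPages = response.totalPages;
    page += 1;
  }
  return items;
}

// Usage:
//   const all = await collectAllPages((page, limit) =>
//     client.getCrawlResults(crawl.id, page, limit));
```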

Abort a crawl

const abortedCrawl: Crawl = await client.abortCrawl(crawl.id);

Get crawl logs

Retrieve logs for a specific crawl to monitor its progress and debug issues:

const page = 1;
const limit = 1200;
const logs: PaginationResult<CrawlLog> = await client.getCrawlLogs(crawl.id, page, limit);

Scrape multiple URLs

Scrape up to 30 URLs at once without following links:

const scrapeResponse: ScrapeResponse = await client.scrape({
  urls: ['https://example.com', 'https://example.com/about'],
  includeLinks: true,
  excludeNonMainTags: true,
  deduplicateContent: true,
  extraction: "",
  timeout: 5,
  engineType: "auto",
  useStaticIps: false
});

for (const result of scrapeResponse.results) {
  if (result.isSuccess) {
    console.log(`URL: ${result.url}`);
    console.log(`Title: ${result.title}`);
    console.log(`Links found: ${result.links.length}`);
  } else {
    console.log(`Failed to scrape ${result.url}: ${result.errorMessage}`);
  }
}
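Since a single scrape call accepts at most 30 URLs, a longer list has to be split into batches first. A minimal sketch, assuming a hypothetical `chunkUrls` helper that is not part of the SDK:

```typescript
// Limit stated in this README for a single scrape call.
const MAX_URLS_PER_SCRAPE = 30;

// Hypothetical helper: splits a URL list into batches of at most `size` entries.
function chunkUrls(urls: string[], size: number = MAX_URLS_PER_SCRAPE): string[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: string[][] = [];
  for (let start = 0; start < urls.length; start += size) {
    batches.push(urls.slice(start, start + size));
  }
  return batches;
}

// Usage:
//   for (const batch of chunkUrls(allUrls)) {
//     const response = await client.scrape({ urls: batch });
//   }
```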

Objects

RunCrawlPayload

You must provide at least one of the following: maxResults, useSitemap, or entireWebsite.

| Property | Type | Required | Default | Description |
|--------|------|----------|---------|-------------|
| startUrl | string | ✅ | - | The URL to start crawling from |
| maxResults | number | ❌ | - | Maximum number of results to collect (1-5,000) |
| useSitemap | boolean | ❌ | false | Whether to use sitemap.xml to crawl the website |
| entireWebsite | boolean | ❌ | false | Whether to use sitemap.xml and all found links to crawl the website |
| maxDepth | number | ❌ | 10 | Maximum depth of pages to crawl (1-100) |
| includeLinks | boolean | ❌ | true | Whether to include links in the crawl results' markdown |
| excludeNonMainTags | boolean | ❌ | true | Whether to exclude non-main HTML tags (header, footer, aside, etc.) from the crawl results |
| deduplicateContent | boolean | ❌ | true | Remove duplicate content from markdown that appears on multiple pages |
| extraction | string | ❌ | "" | Instructions defining how the AI should extract specific content from the crawl results |
| timeout | number | ❌ | 60 | Timeout duration in minutes |
| engineType | string | ❌ | "auto" | The engine to use: "auto", "cheerio" (fast, static sites), "playwright" (dynamic sites) |
| useStaticIps | boolean | ❌ | false | Whether to use static IPs for the crawl |
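The constraints described above (at least one of maxResults, useSitemap, or entireWebsite, plus the documented ranges) can be checked before sending a request. The sketch below is a hypothetical client-side check, not SDK behavior. It assumes that a boolean flag only counts as "provided" when it is true and that the numeric limits are whole numbers; the README states neither explicitly.

```typescript
// Local mirror of the RunCrawlPayload fields that carry documented constraints.
interface CrawlOptionsToCheck {
  startUrl: string;
  maxResults?: number;
  useSitemap?: boolean;
  entireWebsite?: boolean;
  maxDepth?: number;
}

// Hypothetical helper: returns a list of problems, empty when the options look valid.
function checkCrawlOptions(options: CrawlOptionsToCheck): string[] {
  const problems: string[] = [];
  const { maxResults, useSitemap, entireWebsite, maxDepth } = options;

  // Assumption: a flag set to false does not satisfy the "at least one" rule.
  if (maxResults === undefined && useSitemap !== true && entireWebsite !== true) {
    problems.push("Provide at least one of maxResults, useSitemap, or entireWebsite");
  }
  // Assumption: limits are integers within the documented ranges.
  if (maxResults !== undefined &&
      (!Number.isInteger(maxResults) || maxResults < 1 || maxResults > 5000)) {
    problems.push("maxResults must be an integer between 1 and 5000");
  }
  if (maxDepth !== undefined &&
      (!Number.isInteger(maxDepth) || maxDepth < 1 || maxDepth > 100)) {
    problems.push("maxDepth must be an integer between 1 and 100");
  }
  return problems;
}
```

Returning a list instead of throwing on the first problem lets a caller report every issue at once.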

Crawl

| Property | Type | Description |
|-------|------|-------------|
| id | string | Unique crawl identifier |
| status | string | Current status ("RUNNING", "SUCCEEDED", "FAILED", "TIMED_OUT", "ABORTED", "ERROR") |
| startUrls | string[] | Starting URLs |
| includeLinks | boolean | Whether links are included |
| maxDepth | number | Maximum crawl depth |
| maxResults | number | Maximum number of results |
| teamId | string | Team identifier |
| createdAt | string | Creation timestamp |
| completedAt | string \| null | Completion timestamp |
| durationInSeconds | number | Total duration in seconds |
| totalPages | number | Number of results found |
| useSitemap | boolean | Whether sitemap was used |
| entireWebsite | boolean | Whether both the sitemap and all found links were used |
| deduplicateContent | boolean | Whether duplicate content that appears on multiple pages is removed from the markdown |
| extraction | string | Instructions defining how the AI should extract specific content from the crawl results |
| excludeNonMainTags | boolean | Whether non-main HTML tags were excluded |
| timeout | number | The timeout of the crawl in minutes |

CrawlResult

Each crawl result includes:

| Property | Type | Description |
|-------|------|-------------|
| id | string | Unique identifier for the page result |
| crawlId | string | Unique identifier for the crawl |
| url | string | The crawled URL |
| title | string | Page title extracted from the HTML |
| markdown | string | Extracted content in markdown format |
| depthOfUrl | number | How deep this URL was in the crawl (0 = start URL) |
| isSuccess | boolean | Whether the crawl was successful |
| error | string | Error message if the crawl failed |
| createdAt | string | ISO timestamp when the result was created |

CrawlLog

Each crawl log includes:

| Property | Type | Description |
|-------|------|-------------|
| id | string | Unique identifier for the log entry |
| message | string | Log message |
| level | string | Log level: "info", "debug", "warn", or "error" |
| description | string | Detailed description of the log entry |
| crawlId | string | Unique identifier for the crawl |
| teamId | string \| null | Team identifier |
| data | object \| null | Additional data associated with the log entry |
| createdAt | Date | Timestamp when the log was created |

StartScrapePayload

| Property | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| urls | string[] | ✅ | - | Array of URLs to scrape (max 30 URLs) |
| includeLinks | boolean | ❌ | true | Whether to include links in the scrape results' markdown |
| excludeNonMainTags | boolean | ❌ | true | Whether to exclude non-main tags from the scrape results' markdown |
| deduplicateContent | boolean | ❌ | true | Whether to remove duplicate text fragments that appeared in multiple scraped pages |
| extraction | string | ❌ | "" | Instructions defining how the AI should extract specific content from the scrape results |
| timeout | number | ❌ | 5 | Timeout duration in minutes |
| engineType | string | ❌ | "auto" | The engine to use: "auto", "cheerio" (fast, static sites), "playwright" (dynamic sites) |
| useStaticIps | boolean | ❌ | false | Whether to use static IPs for the scrape |

ScrapeResponse

The response from a scrape operation:

| Property | Type | Description |
|-------|------|-------------|
| id | string | Unique scrape identifier |
| teamId | string | Team identifier |
| projectId | string | Project identifier |
| results | UrlResult[] | Array of scrape results |
| timeout | number | Timeout in minutes |
| origin | string | Origin of the scrape ("api" or "web") |
| createdAt | Date | Creation timestamp |
| updatedAt | Date | Last update timestamp |

UrlResult

Each URL scrape result includes:

| Property | Type | Description |
|-------|------|-------------|
| url | string | The URL that was scraped |
| isSuccess | boolean | Whether the scrape was successful |
| title | string | Page title |
| markdown | string | Extracted content in markdown format |
| links | string[] | Links found on the page |
| duplicatesRemovedCount | number | Number of duplicate content blocks removed |
| errorCode | number | Error code if the scrape failed |
| errorMessage | string | Error message if the scrape failed |

Error Handling

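The SDK throws when a request fails, for example when the API returns an error response. Wrap your calls in try-catch blocks:

import { AxiosError } from 'axios';

try {
  const crawl = await client.runCrawl({
    startUrl: 'https://example.com',
    maxResults: 10
  });
} catch (error) {
  if (error instanceof AxiosError) {
    // API returned an error response; `response` is undefined if no response was received
    console.error('API Error:', error.response?.status, error.response?.data);
  } else {
    // Not an HTTP error: rethrow so it is not silently swallowed
    throw error;
  }
}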
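If you prefer a single place that turns any caught value into a log message, a structural check avoids depending on a specific error class. The sketch below is a hypothetical helper, not part of the SDK. It swaps `instanceof AxiosError` for a check on the shape of the error (an object with a `response.status` number), which is an assumption about how HTTP failures surface.

```typescript
// Hypothetical helper: describes a caught value without importing an error class.
function describeError(error: unknown): string {
  if (typeof error === "object" && error !== null && "response" in error) {
    // Structural check: treat any object carrying response.status as an HTTP error.
    const response = (error as { response?: { status?: unknown } }).response;
    if (response && typeof response.status === "number") {
      return `API error (HTTP ${response.status})`;
    }
  }
  if (error instanceof Error) {
    return `Request failed: ${error.message}`;
  }
  return `Unknown failure: ${String(error)}`;
}

// Usage:
//   try { await client.runCrawl({ startUrl, maxResults: 10 }); }
//   catch (error) { console.error(describeError(error)); }
```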