@olostep/inngest

v1.0.0

Official Olostep integration for Inngest - Build reliable web scraping workflows with durable functions. Search, extract, and structure web data from any website.

Official Olostep integration for Inngest - Build reliable, fault-tolerant web scraping workflows with durable functions.

Olostep is a web search, scraping, and crawling API that extracts structured web data from any website in real time. Perfect for automating research workflows, monitoring competitors, and collecting data at scale.

Inngest is a platform for building reliable background jobs and workflows with automatic retries, scheduling, and observability.

Installation
Quick Start
Operations
Credentials
Examples
Specialized Parsers
Resources

Installation

npm install @olostep/inngest inngest

Or with yarn/pnpm:

yarn add @olostep/inngest inngest
pnpm add @olostep/inngest inngest

Quick Start

import { Inngest } from 'inngest';
import { createOlostepClient } from '@olostep/inngest';

// Initialize clients
const inngest = new Inngest({ id: 'my-app' });
const olostep = createOlostepClient({
  apiKey: process.env.OLOSTEP_API_KEY!,
});

// Create a durable web scraping workflow
export const scrapeWorkflow = inngest.createFunction(
  { id: 'scrape-website' },
  { event: 'scrape/start' },
  async ({ event, step }) => {
    // Each step.run is automatically retried on failure
    const result = await step.run('scrape-page', () =>
      olostep.scrape({
        url: event.data.url,
        formats: ['markdown'],
      })
    );

    return {
      url: result.url,
      content: result.markdownContent,
      metadata: result.pageMetadata,
    };
  }
);

Operations

Scrape Website

Extract content from a single URL. Supports multiple formats and JavaScript rendering.

Use Cases:

  • Monitor specific pages for changes
  • Extract product information from e-commerce sites
  • Gather data from news articles or blog posts
  • Pull content for aggregation

Parameters:

  • url (required): The URL of the website you want to scrape
  • formats: Output formats (html, markdown, json, text) - default: ['markdown']
  • country: Country code (e.g., US, GB, CA) for location-specific scraping
  • waitBeforeScraping: Wait time in milliseconds before scraping (0-10000)
  • parser: Parser ID for specialized extraction (e.g., @olostep/amazon-product)

Example:

const result = await step.run('scrape', () =>
  olostep.scrape({
    url: 'https://example.com',
    formats: ['markdown', 'html'],
    country: 'US',
  })
);
// Returns: { id, url, markdownContent, htmlContent, pageMetadata, ... }
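
The documented constraints (a required `url`, a `waitBeforeScraping` range of 0-10000) can be checked locally before a request is sent. The following is a minimal sketch under stated assumptions: `ScrapeParams` and `validateScrapeParams` are hypothetical names that mirror the parameter list above, and the package's own exported types may differ.

```typescript
// Hypothetical shape mirroring the documented parameters; not the package's exported type.
type ScrapeFormat = 'html' | 'markdown' | 'json' | 'text';

interface ScrapeParams {
  url: string;
  formats?: ScrapeFormat[];
  country?: string;
  waitBeforeScraping?: number;
  parser?: string;
}

// Returns a list of problems; an empty list means the params passed these checks.
function validateScrapeParams(params: ScrapeParams): string[] {
  const problems: string[] = [];

  try {
    const parsed = new URL(params.url);
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
      problems.push('url must use http or https');
    }
  } catch {
    problems.push('url is not a valid absolute URL');
  }

  const wait = params.waitBeforeScraping;
  if (wait !== undefined && (!Number.isFinite(wait) || wait < 0 || wait > 10000)) {
    problems.push('waitBeforeScraping must be between 0 and 10000 milliseconds');
  }

  return problems;
}
```

Running the check inside the handler but outside `step.run` keeps obviously bad input from consuming retries.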

Batch Scrape URLs

Scrape up to 100,000 URLs in parallel. Perfect for large-scale data extraction.

Use Cases:

  • Scrape entire product catalogs
  • Extract data from multiple search results
  • Process lists of URLs from spreadsheets
  • Bulk content extraction

Parameters:

  • items (required): Array of { url, customId? } objects
  • formats: Output formats - default: ['markdown']
  • country: Country code for location-specific scraping
  • parser: Parser ID for specialized extraction

Example:

// Create batch job
const batch = await step.run('create-batch', () =>
  olostep.batch.create({
    items: [
      { url: 'https://example1.com', customId: 'page-1' },
      { url: 'https://example2.com', customId: 'page-2' },
    ],
    formats: ['markdown'],
  })
);

// Wait for processing
await step.sleep('wait', '2m');

// Get results
const results = await step.run('get-results', () =>
  olostep.batch.get(batch.id)
);
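
A fixed two-minute sleep can be too short for a large batch and wasteful for a small one. Polling is one alternative. This sketch makes several assumptions: `StepLike` is a simplified stand-in for Inngest's `step` (the real types are richer, so a thin adapter or cast may be needed), `waitForBatch` is a hypothetical helper, and the `status` field is a guess, because this README only shows `id` and `completedItems` on the batch object.

```typescript
// Simplified stand-in for the parts of Inngest's `step` used here, so the
// sketch is self-contained. The real `step` has richer types.
interface StepLike {
  run<T>(id: string, fn: () => Promise<T>): Promise<T>;
  sleep(id: string, duration: string): Promise<void>;
}

// ASSUMPTION: the batch object exposes a `status` field. The README only
// shows `id` and `completedItems`, so adapt `isDone` to the real response.
interface BatchLike {
  id: string;
  status?: string;
}

async function waitForBatch<B extends BatchLike>(
  step: StepLike,
  fetchBatch: () => Promise<B>,
  isDone: (batch: B) => boolean,
  maxAttempts = 10,
  interval = '30s',
): Promise<B> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Step IDs include the attempt number so each poll shows up as its own step.
    const batch = await step.run(`check-batch-${attempt}`, fetchBatch);
    if (isDone(batch)) return batch;
    await step.sleep(`wait-batch-${attempt}`, interval);
  }
  throw new Error(`Batch not finished after ${maxAttempts} checks`);
}
```

In a workflow this might be called as `waitForBatch(step, () => olostep.batch.get(batch.id), (b) => b.status === 'completed')`, where `'completed'` is again an assumed value.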

Create Crawl

Autonomously discover and scrape entire websites by following links.

Use Cases:

  • Crawl and archive entire documentation sites
  • Extract all blog posts from a website
  • Build knowledge bases from web content
  • Monitor website structure changes

Parameters:

  • startUrl (required): Starting URL for the crawl
  • maxPages: Maximum number of pages to crawl (default: 10)
  • followLinks: Whether to follow links (default: true)
  • formats: Output formats - default: ['markdown']
  • includeUrls: Glob patterns to include (e.g., ['/docs/**'])
  • excludeUrls: Glob patterns to exclude (e.g., ['/admin/**'])

Example:

const crawl = await step.run('start-crawl', () =>
  olostep.crawl.create({
    startUrl: 'https://docs.example.com',
    maxPages: 100,
    includeUrls: ['/docs/**'],
  })
);
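
To reason about which paths a pattern such as `/docs/**` would select, it can help to model the matching locally. The sketch below assumes conventional glob semantics (`*` stays within one path segment, `**` spans segments) and that exclusions win over inclusions. Neither is confirmed by this README, so Olostep's actual matcher may behave differently. `globToRegExp` and `shouldCrawl` are hypothetical helpers.

```typescript
// Convert a glob to a RegExp under conventional semantics (an assumption,
// not Olostep's documented behavior): `**` spans segments, `*` stays within one.
function globToRegExp(glob: string): RegExp {
  let pattern = '';
  for (let i = 0; i < glob.length; i++) {
    const ch = glob.charAt(i);
    if (ch === '*') {
      if (glob.charAt(i + 1) === '*') {
        pattern += '.*';
        i++; // consume the second '*'
      } else {
        pattern += '[^/]*';
      }
    } else {
      // Escape characters that are special in regular expressions.
      pattern += ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    }
  }
  return new RegExp(`^${pattern}$`);
}

// ASSUMPTION: an empty include list means "everything", and excludes take priority.
function shouldCrawl(path: string, include: string[] = [], exclude: string[] = []): boolean {
  const matches = (globs: string[]) => globs.some((g) => globToRegExp(g).test(path));
  if (matches(exclude)) return false;
  return include.length === 0 || matches(include);
}
```

Under these assumptions `/docs/api/scrape` passes `includeUrls: ['/docs/**']`, while `/blog/post` does not.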

Create Map

Extract all URLs from a website for content discovery and site structure analysis.

Use Cases:

  • Build sitemaps and site structure diagrams
  • Discover all pages before batch scraping
  • Find broken or missing pages
  • SEO audits and analysis

Parameters:

  • url (required): Website URL to extract links from
  • searchQuery: Optional search query to filter URLs
  • topN: Limit the number of URLs returned
  • includeUrls: Glob patterns to include
  • excludeUrls: Glob patterns to exclude

Example:

const siteMap = await step.run('discover-urls', () =>
  olostep.map({
    url: 'https://example.com',
    includeUrls: ['/blog/**'],
    topN: 100,
  })
);
// Returns: { id, url, totalUrls, urls: string[] }

AI-Powered Answers

Search the web and get AI-powered answers with sources and citations.

Use Cases:

  • Enrich data with web-sourced facts
  • Ground AI applications on real-world data
  • Research tasks with verified outputs
  • Competitive intelligence

Parameters:

  • task (required): Question or task to answer
  • jsonSchema: JSON schema for structured output

Example:

const answer = await step.run('get-answer', () =>
  olostep.answer({
    task: 'Who is the CEO of Anthropic?',
    jsonSchema: {
      ceo_name: '',
      founded_year: '',
      headquarters: '',
    },
  })
);
// Returns: { id, task, answer: { ceo_name: 'Dario Amodei', ... }, sources }
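
Before downstream steps rely on the structured `answer`, it may be worth confirming that every requested key is present. This is a minimal sketch: `hasTemplateKeys` is a hypothetical helper, and whether the API guarantees that all keys are filled is not stated in this README.

```typescript
// Checks that `value` is an object containing every key of `template`.
// The template mirrors the `jsonSchema` object passed to the request.
function hasTemplateKeys<T extends Record<string, unknown>>(
  value: unknown,
  template: T,
): value is Record<keyof T, unknown> {
  if (typeof value !== 'object' || value === null) return false;
  return Object.keys(template).every((key) => key in value);
}
```

A caller could then branch on `hasTemplateKeys(answer.answer, schema)` and treat a miss as a failed step rather than passing partial data along.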

Web Search

Search the web using Google Search and get structured results.

Use Cases:

  • Automated research workflows
  • Lead discovery and enrichment
  • Competitive analysis
  • Content research

Parameters:

  • query (required): Search query
  • country: Country code for geo-specific results
  • numResults: Number of results to return

Example:

const results = await step.run('search', () =>
  olostep.search({
    query: 'best web scraping APIs 2024',
    country: 'US',
    numResults: 10,
  })
);
// Returns: { query, results: [{ title, url, snippet, position }] }

Credentials

To use this package, you need an Olostep API key:

  1. Sign up for an account at olostep.com
  2. Get your API key from the Olostep Dashboard
  3. Set it as an environment variable:

export OLOSTEP_API_KEY=your_api_key_here

Or pass it directly to the client:

const olostep = createOlostepClient({
  apiKey: 'your_api_key_here',
});
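
Reading the key with a non-null assertion (`process.env.OLOSTEP_API_KEY!`) hides a missing variable until the first request fails. A fail-fast check at startup is one way to surface that earlier. `requireApiKey` below is a hypothetical helper, not part of the package.

```typescript
// Hypothetical helper: read the key from an env-like object and fail fast
// with a clear message when it is missing or blank.
function requireApiKey(
  env: Record<string, string | undefined>,
  name = 'OLOSTEP_API_KEY',
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}; set it before creating the Olostep client`);
  }
  return value;
}
```

It would be used as `createOlostepClient({ apiKey: requireApiKey(process.env) })`, which also keeps literal keys out of source control.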

Examples

Multi-Step Research Workflow

export const researchWorkflow = inngest.createFunction(
  { id: 'web-research' },
  { event: 'research/start' },
  async ({ event, step }) => {
    // Step 1: Discover pages
    const siteMap = await step.run('discover-pages', () =>
      olostep.map({
        url: event.data.websiteUrl,
        includeUrls: ['/blog/**'],
        topN: 50,
      })
    );

    // Step 2: Batch scrape discovered URLs
    const batch = await step.run('create-batch', () =>
      olostep.batch.create({
        items: siteMap.urls.map((url, idx) => ({
          url,
          customId: `page-${idx}`,
        })),
        formats: ['markdown'],
      })
    );

    // Step 3: Wait for processing
    await step.sleep('wait', '3m');

    // Step 4: Get results
    const results = await step.run('get-results', () =>
      olostep.batch.get(batch.id)
    );

    return {
      pagesFound: siteMap.totalUrls,
      pagesScraped: results.completedItems,
    };
  }
);
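
The workflow above passes `siteMap.urls` straight into one batch. When the URL list may contain duplicates or exceed the 100,000-URL limit mentioned under Batch Scrape URLs, the list can be cleaned and split first. This is a sketch with a hypothetical helper, `toBatchChunks`; the `page-N` IDs follow the example's convention.

```typescript
interface BatchItem {
  url: string;
  customId: string;
}

// Limit stated in the Batch Scrape URLs section of this README.
const MAX_BATCH_ITEMS = 100_000;

// Deduplicate URLs, then split them into chunks no larger than `chunkSize`.
// customId values are numbered across chunks so they stay unique overall.
function toBatchChunks(urls: string[], chunkSize: number = MAX_BATCH_ITEMS): BatchItem[][] {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const unique = Array.from(new Set(urls));
  const chunks: BatchItem[][] = [];
  for (let start = 0; start < unique.length; start += chunkSize) {
    chunks.push(
      unique.slice(start, start + chunkSize).map((url, offset) => ({
        url,
        customId: `page-${start + offset}`,
      })),
    );
  }
  return chunks;
}
```

Each chunk could then be submitted in its own `step.run`, with a distinct step ID per chunk.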

Scheduled Monitoring

export const dailyMonitor = inngest.createFunction(
  { id: 'daily-monitor' },
  { cron: '0 9 * * *' }, // Every day at 9am
  async ({ step }) => {
    const result = await step.run('check-competitor', () =>
      olostep.scrape({
        url: 'https://competitor.com/pricing',
        formats: ['markdown'],
      })
    );

    // Process and alert on changes...
    return { scraped: true };
  }
);
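
The "process and alert on changes" step is left open in the example. One simple approach is to compare a fingerprint of the current markdown with the one stored from the previous run. The sketch below uses a 32-bit FNV-1a hash over UTF-16 code units to stay dependency-free. It is not cryptographic, so a SHA-256 digest would be a safer choice where collisions matter. `fingerprint` and `hasChanged` are hypothetical helpers, and where the previous fingerprint is stored is left to the caller.

```typescript
// Collapse whitespace so cosmetic reflows do not count as changes.
function normalize(markdown: string): string {
  return markdown.replace(/\s+/g, ' ').trim();
}

// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex characters.
// Not cryptographic; intended only for cheap change detection.
function fingerprint(markdown: string): string {
  const text = normalize(markdown);
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

// The first run has no previous fingerprint, so it never reports a change.
function hasChanged(previous: string | undefined, currentMarkdown: string): boolean {
  return previous !== undefined && previous !== fingerprint(currentMarkdown);
}
```

Inside the workflow, the fingerprint of `result.markdownContent` could be computed in its own step and an alert sent only when `hasChanged` returns true.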

Using Middleware

Inject the Olostep client into all your Inngest functions:

import { Inngest } from 'inngest';
import { createOlostepMiddleware } from '@olostep/inngest';

const inngest = new Inngest({
  id: 'my-app',
  middleware: [
    createOlostepMiddleware({
      apiKey: process.env.OLOSTEP_API_KEY!,
    }),
  ],
});

// Now ctx.olostep is available in all functions
export const myFunction = inngest.createFunction(
  { id: 'my-function' },
  { event: 'app/event' },
  async ({ event, step, ctx }) => {
    const result = await step.run('scrape', () =>
      ctx.olostep.scrape({ url: event.data.url })
    );
    return result;
  }
);

Specialized Parsers

Olostep provides pre-built parsers for popular websites. Use them with the parser parameter:

  • @olostep/google-search - Extract search results, titles, snippets, URLs
  • @olostep/google-maps - Extract business info, reviews, ratings, location
  • @olostep/amazon-product - Extract product details, prices, reviews, images
  • @olostep/linkedin-profile - Extract LinkedIn profile data
  • @olostep/extract-emails - Extract emails from pages
  • @olostep/extract-socials - Extract social profile links

Example:

const product = await olostep.scrape({
  url: 'https://amazon.com/dp/PRODUCT_ID',
  parser: '@olostep/amazon-product',
  formats: ['json'],
});

Error Handling

The client throws typed errors for different scenarios:

import { OlostepError, OlostepRateLimitError, OlostepAuthError } from '@olostep/inngest';

try {
  await olostep.scrape({ url: 'https://example.com' });
} catch (error) {
  if (error instanceof OlostepRateLimitError) {
    // Rethrow so Inngest can retry the step with backoff;
    // a swallowed error would be treated as success
    throw error;
  } else if (error instanceof OlostepAuthError) {
    // Invalid API key
  } else if (error instanceof OlostepError) {
    console.log(error.code, error.statusCode);
  }
}
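
Not every failure benefits from a retry: an invalid API key fails the same way each time, while rate limits and server errors are often transient. The sketch below classifies an error by HTTP status. It assumes only that `statusCode` carries the HTTP status, as the `console.log` call above suggests. `classifyByStatus` is a hypothetical helper and the policy is an illustration, not the package's behavior.

```typescript
type RetryDecision = 'retry' | 'fail';

// Decide whether an error is worth retrying based on its HTTP status.
function classifyByStatus(statusCode: number | undefined): RetryDecision {
  if (statusCode === undefined) return 'retry'; // no status: likely a network-level failure
  if (statusCode === 429) return 'retry'; // rate limited: back off and try again
  if (statusCode >= 500) return 'retry'; // server-side error: often transient
  return 'fail'; // other 4xx (bad key, bad input): retrying will not help
}
```

For the `'fail'` case, throwing the `NonRetriableError` exported by the `inngest` package tells Inngest not to retry the step. For `'retry'`, rethrowing the original error lets the normal retry policy apply.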

Compatibility

  • Node.js: >= 18.0.0
  • Inngest: >= 3.0.0
  • TypeScript: Full type support included

Resources

Support

Need help with the Inngest integration?

License

MIT © Olostep