
n8n-nodes-olostep

v0.2.8


n8n node for Olostep Web Search, Scraping and Crawling API - Search, extract and structure web data from any website


n8n-nodes-olostep

The Olostep n8n integration brings powerful web search, scraping, and crawling capabilities to n8n workflows. Build automated workflows that search, extract, and structure web data from any website without writing code.

Olostep is a web search, scraping, and crawling API that extracts structured web data from any website in real time. Perfect for automating research workflows, monitoring competitors, and collecting data at scale.

n8n is a fair-code licensed workflow automation platform.

Installation
Operations
Credentials
Workflow Template
Testing Locally
Publishing to npm
Compatibility
Resources

Installation

Follow the installation guide in the n8n community nodes documentation.

Or install directly via npm:

npm install n8n-nodes-olostep

Then restart n8n to load the new node.

Operations

Scrape Website

Extract content from a single URL. Supports multiple formats and JavaScript rendering.

Use Cases:

  • Monitor specific pages for changes
  • Extract product information from e-commerce sites
  • Gather data from news articles or blog posts
  • Pull content for content aggregation

Parameters:

  • URL to Scrape (required): The URL of the website you want to scrape
  • Output Format: Choose format (HTML, Markdown, JSON, or Plain Text) - default: Markdown
  • Country Code: Optional country code (e.g., US, GB, CA) for location-specific scraping
  • Wait Before Scraping: Wait time in milliseconds before scraping (0-10000) - useful for dynamic content
  • Parser: Optional parser ID for specialized extraction (e.g., @olostep/amazon-product)

Output Fields:

  • id: Scrape ID
  • url: Scraped URL
  • markdown_content: Markdown formatted content
  • html_content: HTML formatted content
  • json_content: JSON formatted content
  • text_content: Plain text content
  • status: Status of the scrape
  • timestamp: Timestamp of the scrape
  • screenshot_hosted_url: URL to screenshot (if available)
  • page_metadata: Page metadata
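In a downstream n8n Code node you will often want only a few of these fields. A minimal sketch, assuming a result shaped like the output fields listed above (the sample values and the `summarizeScrape` helper are illustrative, not part of the node's API):

```javascript
// Hypothetical scrape result, shaped like the output fields above.
const item = {
  id: 'scrape_123',
  url: 'https://example.com',
  markdown_content: '# Example Domain',
  status: 'completed',
};

// Keep only the fields a later node needs.
function summarizeScrape(result) {
  return {
    url: result.url,
    status: result.status,
    // Fall back to an empty string if the format was not requested.
    content: result.markdown_content ?? '',
  };
}

const summary = summarizeScrape(item);
```

This keeps large unused payloads (HTML, screenshots) out of the items passed between nodes.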

Search

Search the web for a given query and get structured results (parser-based, non-AI search results).

Use Cases:

  • Automated research workflows
  • Lead discovery and enrichment
  • Competitive analysis
  • Content research

Parameters:

  • Query (required): Search query

Output: Returns structured search results as JSON.

Answers (AI)

Search the web and return AI-powered answers in the JSON structure you want, with sources and citations.

Use Cases:

  • Enrich spreadsheets or records with web-sourced facts
  • Ground AI applications on real-world data and sources
  • Research tasks that require verified outputs with citations

Parameters:

  • Task (required): Question or task to answer using web data

  • JSON Schema (Optional): JSON schema/object or a short description of the desired output shape

    Examples:

    • Object schema: { "book_title": "", "author": "", "release_date": "" }
    • Description string: "Return a list of the top 5 competitors with name and homepage URL"

Output Fields:

  • answer_id: Answer ID
  • object: Object type
  • task: Original task
  • result: Parsed JSON if a JSON Schema was provided; otherwise the answer object
  • sources: Array of source URLs used
  • created: Creation timestamp
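When a schema is supplied, it can be worth checking that every requested key actually came back filled in. A small sketch, reusing the object-schema example from the parameters above (the sample answer and the `missingFields` helper are hypothetical):

```javascript
// Hypothetical Answers result, shaped like the output fields above;
// the result keys mirror the object-schema example in the parameters.
const answer = {
  answer_id: 'ans_123',
  task: 'Find details for the first Harry Potter book',
  result: {
    book_title: "Harry Potter and the Philosopher's Stone",
    author: 'J. K. Rowling',
  },
  sources: ['https://example.com/source'],
};

// Return the schema keys that are absent or empty in the result.
function missingFields(result, schema) {
  return Object.keys(schema).filter((key) => !result[key]);
}

const schema = { book_title: '', author: '', release_date: '' };
const missing = missingFields(answer.result, schema);
```

Here `missing` would contain `release_date`, which a workflow could route to a retry or a manual-review branch.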

Batch Scrape URLs

Scrape up to 10,000 URLs in a single request. Perfect for large-scale data extraction.

Use Cases:

  • Scrape entire product catalogs
  • Extract data from multiple search results
  • Process lists of URLs from spreadsheets
  • Bulk content extraction

Parameters:

  • URLs to Scrape (required): JSON array of objects with url and custom_id fields
    • Example: [{"url":"https://example.com","custom_id":"site1"}]
  • Output Format: Choose format for all URLs - default: Markdown
  • Country Code: Optional country code for location-specific scraping
  • Wait Before Scraping: Wait time in milliseconds before scraping each URL
  • Parser: Optional parser ID for specialized extraction
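The URLs to Scrape parameter can be built programmatically, for example in a Code node fed by a spreadsheet. A sketch that turns a plain list of links into the JSON array shown in the example above (the `site1`-style IDs follow that example and are just one possible naming scheme):

```javascript
// Plain list of links, e.g. read from a spreadsheet node.
const links = ['https://example.com/a', 'https://example.com/b'];

// Build the "URLs to Scrape" JSON, assigning a custom_id to each
// entry so batch results can be matched back to their source row.
const urlsParam = JSON.stringify(
  links.map((url, i) => ({ url, custom_id: `site${i + 1}` }))
);
```

Because `custom_id` is echoed back with each result, it is the simplest way to join batch output to the original rows.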

Output Fields:

  • batch_id: Batch ID (use this to retrieve results later)
  • status: Status of the batch
  • total_urls: Total number of URLs in the batch
  • created_at: Creation timestamp
  • formats: Requested format
  • country: Country code used
  • parser: Parser used
  • urls: Array of URLs in the batch

Create Crawl

Get the content of a URL's subpages. Autonomously discovers and scrapes entire websites by following links. Perfect for documentation sites, blogs, and content repositories.

Use Cases:

  • Crawl and archive entire documentation sites
  • Extract all blog posts from a website
  • Build knowledge bases from web content
  • Monitor website structure changes

Parameters:

  • Start URL (required): Starting URL for the crawl
  • Maximum Pages: Maximum number of pages to crawl (default: 10)
  • Follow Links: Whether to follow links found on pages (default: true)
  • Output Format: Format for scraped content - default: Markdown
  • Country Code: Optional country code for location-specific crawling
  • Parser: Optional parser ID for specialized content extraction

Output Fields:

  • crawl_id: Crawl ID (use this to retrieve results later)
  • object: Object type
  • status: Status of the crawl
  • start_url: Starting URL
  • max_pages: Maximum pages
  • follow_links: Whether links are followed
  • created: Creation timestamp
  • formats: Formats requested
  • country: Country code used
  • parser: Parser used

Create Map

Extract all URLs from a website for content discovery and site structure analysis.

Use Cases:

  • Build sitemaps and site structure diagrams
  • Discover all pages before batch scraping
  • Find broken or missing pages
  • SEO audits and analysis

Parameters:

  • Website URL (required): Website URL to extract links from
  • Search Query: Optional search query to filter URLs (e.g., "blog")
  • Top N URLs: Limit the number of URLs returned
  • Include URL Patterns: Glob patterns to include specific paths (e.g., "/blog/**")
  • Exclude URL Patterns: Glob patterns to exclude specific paths (e.g., "/admin/**")

Output Fields:

  • map_id: Map ID
  • object: Object type
  • url: Website URL
  • total_urls: Total URLs found
  • urls: Array of discovered URLs
  • search_query: Search query used
  • top_n: Top N limit
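A common follow-up is to narrow the discovered URLs before feeding them into Batch Scrape. A sketch, assuming a map result shaped like the output fields above; the substring filter is a simplification of the node's Include/Exclude glob patterns (full glob support would need a library such as minimatch):

```javascript
// Hypothetical map result, shaped like the output fields above.
const map = {
  map_id: 'map_123',
  url: 'https://example.com',
  total_urls: 4,
  urls: [
    'https://example.com/',
    'https://example.com/blog/post-1',
    'https://example.com/blog/post-2',
    'https://example.com/admin/login',
  ],
};

// Keep only blog pages, similar in spirit to an include pattern
// of "/blog/**" on the node itself.
const blogUrls = map.urls.filter((u) => u.includes('/blog/'));
```

The filtered list can then be mapped into the `url`/`custom_id` objects that Batch Scrape URLs expects.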

Credentials

To use this node, you need to authenticate with the Olostep Scrape API.

  1. Sign up for an account on olostep.com to get an API key.
  2. Get your API key from the Olostep Dashboard.
  3. Add a new credential in n8n for the Olostep Scrape API.
  4. Enter your API key in the 'API Key' field.

Workflow Template

A sample workflow template is included in workflow-template.json. To use it:

  1. Open n8n
  2. Click on "Workflows" in the sidebar
  3. Click the three dots menu and select "Import from File"
  4. Select the workflow-template.json file
  5. Configure the Olostep credentials
  6. Customize the workflow as needed

The template includes a basic workflow that scrapes a website using the Olostep Scrape Website operation.

Testing Locally

To test your node locally before publishing:

Prerequisites

  • Node.js version 20.15 or higher
  • npm installed

Steps

  1. Build the node:

    npm run build
  2. Link the node locally:

    npm link
  3. Install n8n globally (if not already installed):

    npm install n8n -g
  4. Link the node to n8n:

    # Find your n8n custom nodes directory (usually ~/.n8n/custom/)
    # Or if using n8n locally, navigate to the n8n installation directory
    cd ~/.n8n/custom
    npm link n8n-nodes-olostep
  5. Start n8n:

    n8n start
  6. Test the node:

    • Open n8n in your browser (usually http://localhost:5678)
    • Create a new workflow
    • Add the "Olostep Scrape" node
    • Configure it with your API credentials
    • Test each operation to ensure they work correctly

Running Linter

Before publishing, make sure your code passes linting:

npm run lint

If there are any issues, fix them:

npm run lintfix

Publishing to npm

Once your node is tested and ready:

Prerequisites

  • npm account (create one at npmjs.com if needed)
  • Node.js version 20.15 or higher

Steps

  1. Ensure package.json is correct:

    • Verify version number
    • Check that all required fields are filled
    • Ensure the package name is n8n-nodes-olostep
  2. Build the package:

    npm run build
  3. Run prepublish checks:

    npm run prepublishOnly

    This will build and lint your code.

  4. Login to npm:

    npm login

    Enter your npm username, password, and email when prompted.

  5. Publish to npm:

    npm publish
  6. Verify publication:

    • Visit https://www.npmjs.com/package/n8n-nodes-olostep
    • Check that your latest version is published

Version Management

When making updates:

  1. Update the version in package.json following semantic versioning:

    • Patch (1.0.0 → 1.0.1): Bug fixes
    • Minor (1.0.0 → 1.1.0): New features, backward compatible
    • Major (1.0.0 → 2.0.0): Breaking changes
  2. Update the CHANGELOG.md (if you have one)

  3. Build and publish:

    npm run build
    npm publish
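The three bump levels above follow semantic versioning and are mechanical enough to sketch in a few lines (the `bump` helper is illustrative; in practice `npm version patch|minor|major` does this for you and updates package.json):

```javascript
// Bump a "major.minor.patch" version string per the rules above.
function bump(version, level) {
  const [major, minor, patch] = version.split('.').map(Number);
  if (level === 'major') return `${major + 1}.0.0`;
  if (level === 'minor') return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`; // default: patch
}
```

Note how minor and major bumps reset the lower components to zero, exactly as in the 1.0.0 → 1.1.0 and 1.0.0 → 2.0.0 examples.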

Compatibility

This node is tested against n8n version 1.x and requires Node.js version 20.15 or higher.

Specialized Parsers

Olostep provides pre-built parsers for popular websites and tasks. Use them with the Parser parameter:

  • @olostep/amazon-product - Extract product details, prices, reviews, images, variants
  • @olostep/google-search - Extract search results, titles, snippets, URLs
  • @olostep/google-maps - Extract business info, reviews, ratings, location
  • @olostep/extract-emails - Extract emails from pages and contact sections
  • @olostep/extract-socials - Extract social profile links (X/Twitter, GitHub, etc.)
  • @olostep/extract-calendars - Extract calendar links (Google Calendar, ICS)

Resources

Support

Need help with the n8n integration?