n8n-nodes-olostep
The Olostep n8n integration brings powerful web search, scraping, and crawling capabilities to n8n workflows. Build automated workflows that search, extract, and structure web data from any website without writing code.
Olostep is a web search, scraping, and crawling API that extracts structured web data from any website in real time. Perfect for automating research workflows, monitoring competitors, and collecting data at scale.
n8n is a fair-code licensed workflow automation platform.
Installation
Operations
Credentials
Workflow Template
Testing Locally
Publishing to npm
Compatibility
Resources
Installation
Follow the installation guide in the n8n community nodes documentation.
Or install directly via npm:
npm install n8n-nodes-olostep

Then restart n8n to load the new node.
Operations
Scrape Website
Extract content from a single URL. Supports multiple formats and JavaScript rendering.
Use Cases:
- Monitor specific pages for changes
- Extract product information from e-commerce sites
- Gather data from news articles or blog posts
- Pull content for content aggregation
Parameters:
- URL to Scrape (required): The URL of the website you want to scrape
- Output Format: Choose format (HTML, Markdown, JSON, or Plain Text) - default: Markdown
- Country Code: Optional country code (e.g., US, GB, CA) for location-specific scraping
- Wait Before Scraping: Wait time in milliseconds before scraping (0-10000) - useful for dynamic content
- Parser: Optional parser ID for specialized extraction (e.g., @olostep/amazon-product)
Output Fields:
- id: Scrape ID
- url: Scraped URL
- markdown_content: Markdown formatted content
- html_content: HTML formatted content
- json_content: JSON formatted content
- text_content: Plain text content
- status: Status of the scrape
- timestamp: Timestamp of the scrape
- screenshot_hosted_url: URL to screenshot (if available)
- page_metadata: Page metadata
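For orientation, a completed Markdown scrape returns an object with the fields listed above. The sketch below is illustrative only: every value is a placeholder, and fields for formats you did not request may be empty or omitted (an assumption worth verifying against your own responses).

```json
{
  "id": "scrape_123_example",
  "url": "https://example.com",
  "markdown_content": "# Example Domain\n\nThis domain is for use in illustrative examples.",
  "html_content": null,
  "json_content": null,
  "text_content": null,
  "status": "completed",
  "timestamp": "2025-01-01T12:00:00Z",
  "screenshot_hosted_url": null,
  "page_metadata": { "title": "Example Domain" }
}
```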
Search
Search the web for a given query and get structured, parser-based (non-AI) search results.
Use Cases:
- Automated research workflows
- Lead discovery and enrichment
- Competitive analysis
- Content research
Parameters:
- Query (required): Search query
Output: Returns structured search results as JSON.
Answers (AI)
Search the web and return AI-powered answers in the JSON structure you want, with sources and citations.
Use Cases:
- Enrich spreadsheets or records with web-sourced facts
- Ground AI applications on real-world data and sources
- Research tasks that require verified outputs with citations
Parameters:
- Task (required): Question or task to answer using web data
- JSON Schema (optional): JSON schema/object or a short description of the desired output shape
Examples:
- Object schema: { "book_title": "", "author": "", "release_date": "" }
- Description string: "Return a list of the top 5 competitors with name and homepage URL"
Output Fields:
- answer_id: Answer ID
- object: Object type
- task: Original task
- result: Parsed JSON if a JSON schema was provided; otherwise the answer object
- sources: Array of source URLs used
- created: Creation timestamp
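To show how the pieces fit together, passing the book schema above as the JSON Schema might produce a response shaped like the sketch below. Field names follow the list above; every value is an invented placeholder, and the exact ID and timestamp formats may differ.

```json
{
  "answer_id": "answer_123_example",
  "object": "answer",
  "task": "Find the latest book by this author",
  "result": {
    "book_title": "Example Title",
    "author": "Example Author",
    "release_date": "2024-05-01"
  },
  "sources": [
    "https://example.com/press-release",
    "https://example.com/catalog"
  ],
  "created": "2025-01-01T12:00:00Z"
}
```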
Batch Scrape URLs
Scrape up to 10,000 URLs at the same time. Perfect for large-scale data extraction.
Use Cases:
- Scrape entire product catalogs
- Extract data from multiple search results
- Process lists of URLs from spreadsheets
- Bulk content extraction
Parameters:
- URLs to Scrape (required): JSON array of objects with url and custom_id fields, e.g. [{"url":"https://example.com","custom_id":"site1"}] (see the expanded example after the output fields below)
- Output Format: Choose format for all URLs - default: Markdown
- Country Code: Optional country code for location-specific scraping
- Wait Before Scraping: Wait time in milliseconds before scraping each URL
- Parser: Optional parser ID for specialized extraction
Output Fields:
- batch_id: Batch ID (use this to retrieve results later)
- status: Status of the batch
- total_urls: Total number of URLs in the batch
- created_at: Creation timestamp
- formats: Requested format
- country: Country code used
- parser: Parser used
- urls: Array of URLs in the batch
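When feeding URLs from a spreadsheet or a previous node, the URLs to Scrape parameter takes one object per page, in the format shown in the parameters above. The array below is an illustrative sketch; the custom_id values are your own labels (for example, for matching results back to spreadsheet rows) and can be any string.

```json
[
  { "url": "https://example.com/products/1", "custom_id": "row-1" },
  { "url": "https://example.com/products/2", "custom_id": "row-2" },
  { "url": "https://example.com/products/3", "custom_id": "row-3" }
]
```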
Create Crawl
Get the content of subpages of a URL. Autonomously discover and scrape entire websites by following links. Perfect for documentation sites, blogs, and content repositories.
Use Cases:
- Crawl and archive entire documentation sites
- Extract all blog posts from a website
- Build knowledge bases from web content
- Monitor website structure changes
Parameters:
- Start URL (required): Starting URL for the crawl
- Maximum Pages: Maximum number of pages to crawl (default: 10)
- Follow Links: Whether to follow links found on pages (default: true)
- Output Format: Format for scraped content - default: Markdown
- Country Code: Optional country code for location-specific crawling
- Parser: Optional parser ID for specialized content extraction
Output Fields:
- crawl_id: Crawl ID (use this to retrieve results later)
- object: Object type
- status: Status of the crawl
- start_url: Starting URL
- max_pages: Maximum pages
- follow_links: Whether links are followed
- created: Creation timestamp
- formats: Formats requested
- country: Country code used
- parser: Parser used
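To make the asynchronous flow concrete, creating a crawl returns an object like the sketch below, and the crawl_id is what you use to retrieve the scraped pages later. All values are illustrative placeholders, and details such as whether formats is a string or an array may differ in real responses.

```json
{
  "crawl_id": "crawl_123_example",
  "object": "crawl",
  "status": "in_progress",
  "start_url": "https://docs.example.com",
  "max_pages": 10,
  "follow_links": true,
  "created": "2025-01-01T12:00:00Z",
  "formats": ["markdown"],
  "country": "US",
  "parser": null
}
```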
Create Map
Get all URLs on a website for content discovery and site structure analysis.
Use Cases:
- Build sitemaps and site structure diagrams
- Discover all pages before batch scraping
- Find broken or missing pages
- SEO audits and analysis
Parameters:
- Website URL (required): Website URL to extract links from
- Search Query: Optional search query to filter URLs (e.g., "blog")
- Top N URLs: Limit the number of URLs returned
- Include URL Patterns: Glob patterns to include specific paths (e.g., "/blog/**")
- Exclude URL Patterns: Glob patterns to exclude specific paths (e.g., "/admin/**")
Output Fields:
- map_id: Map ID
- object: Object type
- url: Website URL
- total_urls: Total URLs found
- urls: Array of discovered URLs
- search_query: Search query used
- top_n: Top N limit
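Putting the parameters and output fields together, a map of https://example.com filtered with the search query "blog" might come back shaped like the sketch below. Every value is an illustrative placeholder.

```json
{
  "map_id": "map_123_example",
  "object": "map",
  "url": "https://example.com",
  "total_urls": 3,
  "urls": [
    "https://example.com/blog",
    "https://example.com/blog/first-post",
    "https://example.com/blog/second-post"
  ],
  "search_query": "blog",
  "top_n": 100
}
```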
Credentials
To use this node, you need to authenticate with the Olostep Scrape API.
- Sign up for an account on olostep.com to get an API key.
- Get your API key from the Olostep Dashboard.
- Add a new credential in n8n for the Olostep Scrape API.
- Enter your API key in the 'API Key' field.
Workflow Template
A sample workflow template is included in workflow-template.json. To use it:
- Open n8n
- Click on "Workflows" in the sidebar
- Click the three dots menu and select "Import from File"
- Select the workflow-template.json file
- Configure the Olostep credentials
- Customize the workflow as needed
The template includes a basic workflow that scrapes a website using the Olostep Scrape Website operation.
Testing Locally
To test your node locally before publishing:
Prerequisites
- Node.js version 20.15 or higher
- npm installed
Steps
Build the node:
npm run build

Link the node locally:

npm link

Install n8n globally (if not already installed):

npm install n8n -g

Link the node to n8n:

# Find your n8n custom nodes directory (usually ~/.n8n/custom/)
# Or if using n8n locally, navigate to the n8n installation directory
cd ~/.n8n/custom
npm link n8n-nodes-olostep

Start n8n:

n8n start

Test the node:
- Open n8n in your browser (usually http://localhost:5678)
- Create a new workflow
- Add the "Olostep Scrape" node
- Configure it with your API credentials
- Test each operation to ensure they work correctly
Running Linter
Before publishing, make sure your code passes linting:
npm run lint

If there are any issues, fix them:

npm run lintfix

Publishing to npm
Once your node is tested and ready:
Prerequisites
- npm account (create one at npmjs.com if needed)
- Node.js version 20.15 or higher
Steps
Ensure package.json is correct:
- Verify version number
- Check that all required fields are filled
- Ensure the package name is n8n-nodes-olostep
Build the package:
npm run build

Run prepublish checks:

npm run prepublishOnly

This will build and lint your code.
Login to npm:
npm login

Enter your npm username, password, and email when prompted.
Publish to npm:
npm publish

Verify publication:
- Visit https://www.npmjs.com/package/n8n-nodes-olostep
- Check that your latest version is published
Version Management
When making updates:
Update the version in package.json following semantic versioning (see the example after this list):
- Patch (1.0.0 → 1.0.1): Bug fixes
- Minor (1.0.0 → 1.1.0): New features, backward compatible
- Major (1.0.0 → 2.0.0): Breaking changes
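For example, a patch release for a bug fix only changes the version field in package.json; the snippet below is abbreviated, and the surrounding fields are illustrative.

```json
{
  "name": "n8n-nodes-olostep",
  "version": "1.0.1",
  "description": "n8n node for Olostep Web Search, Scraping and Crawling API"
}
```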
Update the CHANGELOG.md (if you have one)
Build and publish:
npm run build
npm publish
Compatibility
This node is tested against n8n version 1.x and requires Node.js version 20.15 or higher.
Specialized Parsers
Olostep provides pre-built parsers for popular websites and tasks. Use them with the Parser parameter:
- @olostep/amazon-product - Extract product details, prices, reviews, images, variants
- @olostep/google-search - Extract search results, titles, snippets, URLs
- @olostep/google-maps - Extract business info, reviews, ratings, location
- @olostep/extract-emails - Extract emails from pages and contact sections
- @olostep/extract-socials - Extract social profile links (X/Twitter, GitHub, etc.)
- @olostep/extract-calendars - Extract calendar links (Google Calendar, ICS)
Resources
- Olostep API Documentation
- Olostep Dashboard - Get your API key and manage usage
- n8n community nodes documentation
- n8n Node Development Guide
Support
Need help with the n8n integration?
- Documentation: docs.olostep.com
- Support Email: [email protected]
- Website: olostep.com
