n8n-nodes-bozonx-news-crawler-microservice
v2.3.0
n8n nodes for BozonX News Crawler microservice with flexible authentication (None, Basic Auth, Bearer Token)
n8n nodes for BozonX News Crawler microservice
Quickly orchestrate News Crawler workflows inside n8n. The package ships three nodes that cover payload preparation, job dispatch, and data retrieval.
Authentication
The package uses News Crawler API credentials specific to this microservice. Configure authentication in n8n credentials:
- Base URL (required): Full API URL including path (e.g., `https://news-crawler.example.com/api/v1`)
- Authentication: Choose from three options:
- None (default): No authentication required
- Basic Auth: Requires username and password
- Bearer Token: Requires API token
All nodes in this package use these credentials to connect to your News Crawler instance.
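The three modes map onto HTTP headers in the usual way. A minimal TypeScript sketch — the credential field names here are illustrative, not the exact n8n credential schema:

```typescript
// Illustrative model of the three authentication modes described above.
type NewsCrawlerAuth =
  | { type: "none" }
  | { type: "basicAuth"; username: string; password: string }
  | { type: "bearerToken"; token: string };

// Build the request headers each mode implies.
function buildAuthHeaders(auth: NewsCrawlerAuth): Record<string, string> {
  switch (auth.type) {
    case "none":
      return {}; // no extra headers
    case "basicAuth": {
      // Standard HTTP Basic: base64("username:password")
      const encoded = Buffer.from(`${auth.username}:${auth.password}`).toString("base64");
      return { Authorization: `Basic ${encoded}` };
    }
    case "bearerToken":
      return { Authorization: `Bearer ${auth.token}` };
  }
}
```

In n8n itself the credential component handles this for you; the sketch only shows what ends up on the wire.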
Available nodes
News Crawler Request (All-in-One)
- Consolidated functionality: Handles Batch creation, listing, Source listing, and Data retrieval in a single node.
- Operations:
  - Batch: Create: Replaces `News Crawler Start Batch` and `Params` (supports both UI Builder and raw JSON).
  - Batch: Get All: `GET /batches` to list batches.
  - Source: Get All: `GET /sources` to list available sources.
  - Data: Get: `GET /data` to retrieve parsed items (replaces `News Crawler Get Data`).
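Since the configured Base URL already contains the API path, each operation just appends its endpoint. A hedged sketch with `fetch` — the base URL is an example, and the response shape is defined by the service, not this package:

```typescript
// Join the credential's Base URL with an endpoint path,
// tolerating stray slashes on either side.
function joinUrl(base: string, path: string): string {
  return `${base.replace(/\/+$/, "")}/${path.replace(/^\/+/, "")}`;
}

// Example: list batches the way the "Batch: Get All" operation does.
// Auth headers come from your configured credential.
async function listBatches(
  baseUrl: string,
  headers: Record<string, string> = {},
): Promise<unknown> {
  const res = await fetch(joinUrl(baseUrl, "/batches"), { headers });
  if (!res.ok) throw new Error(`GET /batches failed: ${res.status}`);
  return res.json();
}
```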
News Crawler Params
- Collect sources from your registry or configure custom RSS / page crawlers.
- Validates input and builds a payload compatible with `POST /batches`.
- Optional webhook section adds delivery settings without leaving the node.
- Advanced options (fingerprint, locale, time zone) appear contextually, keeping the form compact.
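To make the output shape concrete, here is a hypothetical sketch of the kind of payload the node assembles for `POST /batches`. The field names (`tasks`, `webhook`, `timeoutSecs`) are assumptions based on the options described above, not the actual API schema:

```typescript
// Hypothetical webhook delivery settings (names assumed, not from the API spec).
interface WebhookConfig {
  url: string;
  headers?: Record<string, string>;
  timeoutSecs?: number;
}

// Hypothetical batch payload: a list of crawl tasks plus optional webhook.
interface BatchPayload {
  tasks: Array<{ kind: "rss" | "page"; source: string }>;
  webhook?: WebhookConfig;
}

// Mimic the node's validation: at least one task, webhook only if configured.
function buildBatchPayload(
  tasks: BatchPayload["tasks"],
  webhook?: WebhookConfig,
): BatchPayload {
  if (tasks.length === 0) throw new Error("at least one task is required");
  return webhook ? { tasks, webhook } : { tasks };
}
```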
News Crawler Start Batch
- Sends the payload from the previous node to the microservice gateway.
- Accepts shared credentials (`Gateway URL`, optional base path, API key headers, etc.).
- Choose between referencing the previous node output via expression (`{{$json}}`) or pasting JSON manually.
- Returns the batch metadata from the API, so you can branch on status or batch ID.
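Under the hood the node effectively performs a `POST /batches` with the prepared payload. A sketch under stated assumptions — the URL handling and the response fields (`id`, `status`) are illustrative, not the confirmed API contract:

```typescript
// Send a prepared payload to the gateway, roughly as Start Batch does.
// Returns the batch metadata so downstream nodes can branch on it.
async function startBatch(
  gatewayUrl: string,
  payload: unknown,
  headers: Record<string, string> = {},
): Promise<{ id?: string; status?: string }> {
  const res = await fetch(`${gatewayUrl.replace(/\/+$/, "")}/batches`, {
    method: "POST",
    headers: { "Content-Type": "application/json", ...headers },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`POST /batches failed: ${res.status}`);
  return res.json(); // assumed metadata shape: { id, status, ... }
}
```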
News Crawler Get Data
- Fetches parsed items through `GET /data` using the same gateway credentials.
- Minimal configuration: select sources (CSV), optional `sourceLimit`, and optional date filters:
  - `fromDate` – filter by item publication date (from original source)
  - `fromSavedAt` – filter by when the item was saved to the dataset
  - Note: `fromDate` and `fromSavedAt` are mutually exclusive
- Output JSON follows the API envelope (`sources`, `sourceLimit`, optional `fromDate`/`fromSavedAt`) and exposes a flat `results` array of items for downstream automations (e.g., notifier, storage).
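The filter rules above can be sketched as query-string assembly with the documented mutual-exclusivity check. The wire-level parameter names are assumed to match the option names:

```typescript
// Query options for GET /data, mirroring the node's fields.
interface DataQuery {
  sources: string[];        // entered as CSV in the node UI
  sourceLimit?: number;
  fromDate?: string;        // filter by original publication date
  fromSavedAt?: string;     // filter by save time into the dataset
}

// Serialize the options, enforcing the documented exclusivity rule.
function buildDataQuery(q: DataQuery): string {
  if (q.fromDate && q.fromSavedAt) {
    throw new Error("fromDate and fromSavedAt are mutually exclusive");
  }
  const params = new URLSearchParams({ sources: q.sources.join(",") });
  if (q.sourceLimit !== undefined) params.set("sourceLimit", String(q.sourceLimit));
  if (q.fromDate) params.set("fromDate", q.fromDate);
  if (q.fromSavedAt) params.set("fromSavedAt", q.fromSavedAt);
  return params.toString();
}
```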
Build a workflow in minutes
- Configure credentials – Create a new "News Crawler API" credential with your microservice URL and authentication method (None, Basic Auth, or Bearer Token).
- Prepare tasks – Drop the Params node, pick the task kind, and fill in the required fields. Use the preview to confirm validation.
- Send the batch – Connect it to Start Batch or use the all-in-one Request node, select your credentials, and point the payload field to `{{$json}}` from the Params node.
- Process results – Either listen for your webhook or periodically poll the Get Data node and hand the items to the next step.
Pro tip: wrap the Start Batch node with If or Wait nodes to react to failures or schedule retries.
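The poll-based variant of the last step amounts to a retry loop around Get Data; a generic sketch, with interval and attempt counts as arbitrary examples:

```typescript
// Poll a fetcher until it returns items or attempts run out,
// roughly what a Wait + Get Data loop does in the workflow.
async function pollForResults<T>(
  fetchOnce: () => Promise<T[]>,
  attempts = 5,
  intervalMs = 2000,
): Promise<T[]> {
  for (let i = 0; i < attempts; i++) {
    const items = await fetchOnce();
    if (items.length > 0) return items;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return []; // nothing arrived within the allotted attempts
}
```

In n8n the Wait node plays the role of the `setTimeout` here, so you rarely need custom code for this.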
Helpful UI behaviors
- Task types – Switching the `Kind` toggles only the relevant form groups (registry overrides, RSS selectors, page selectors).
- RSS extraction – For RSS tasks you can override feed fields for link, title, description, date, and tags, matching the backend `extract*` options.
- Locale & timezone – Custom kinds expose locale/timezone overrides so Playwright and parsers match target sites.
- Fingerprinting – Enable fingerprint generation to rotate headers; browser/device lists accept CSV strings.
- Webhook card – Enter a URL to unlock custom headers (JSON), retry overrides, and per-webhook timeout (seconds via `timeoutSecs`); leave blank if you only poll data.
- Legacy fields – The service no longer accepts the `scraper` field. Use the `mode` selector instead.
Development quick start
```shell
npm install
npm run dev
```

The dev script runs n8n with hot reload so you can iterate on node UX quickly.
