gologin-webunlocker
v0.3.1
Compatibility npm package for the GoLogin Scraping API
GoLogin Scraping API SDK (TypeScript)
Minimal Node.js SDK and CLI for stateless page retrieval through the GoLogin Scraping API.
The backend endpoint is:
GET https://parsing.webunlocker.gologin.com/v1/scrape?url={encoded_url}
Authentication is sent via header: apikey: <API_KEY>.
The backend response is raw HTML/text.
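As a minimal sketch of the request described above, the following builds the endpoint URL and the apikey header without using the SDK. The helper name buildScrapeRequest and the ScrapeRequest type are hypothetical and exist only for illustration; only the endpoint path and header name come from this README.

```typescript
// Hypothetical helper (not part of the SDK): builds the URL and headers
// for the GET /v1/scrape request described in this README.
const BASE_URL = "https://parsing.webunlocker.gologin.com";

interface ScrapeRequest {
  url: string;
  headers: { apikey: string };
}

function buildScrapeRequest(targetUrl: string, apiKey: string): ScrapeRequest {
  // The target URL travels as a percent-encoded query parameter.
  const url = `${BASE_URL}/v1/scrape?url=${encodeURIComponent(targetUrl)}`;
  return { url, headers: { apikey: apiKey } };
}

const scrapeRequest = buildScrapeRequest("https://example.com", "wu_live_xxx");
console.log(scrapeRequest.url);
// https://parsing.webunlocker.gologin.com/v1/scrape?url=https%3A%2F%2Fexample.com

// To send it (Node.js 18+ ships a global fetch):
// const response = await fetch(scrapeRequest.url, { headers: scrapeRequest.headers });
// const html = await response.text();
```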
Best fit:
- bot-protected HTML pages
- public JSON/data endpoints hidden behind basic protection
- simple subprocess-style usage without a browser runtime
Not the right fit when you need:
- JavaScript rendering or hydrated DOM
- network request inspection
- clicks, typing, screenshots, or login flows
For those cases, use gologin-agent-browser instead of expecting a stateless scraping request to behave like a browser.
Install
The product and CLI are now named GoLogin Scraping API. The npm package currently remains gologin-webunlocker as a compatibility package until the npm account can create the new gologin-scraping-api package name.
npm install gologin-webunlocker
Install the CLI globally:
npm install -g gologin-webunlocker
If the command is still not found after a global install:
- use npx gologin-webunlocker ...
- or add your global npm bin directory to PATH
Example:
export PATH="$(npm config get prefix)/bin:$PATH"
Compatibility:
- The primary CLI command is gologin-scraping-api.
- The old gologin-webunlocker CLI name is still shipped as an alias by this package.
- The old WebUnlocker class is still exported as an alias for ScrapingApi.
- The old GOLOGIN_WEBUNLOCKER_API_KEY env var is still accepted as an alias for GOLOGIN_SCRAPING_API_KEY.
Get API Key
Use your GoLogin Scraping API key in:
- apikey request header
- GOLOGIN_SCRAPING_API_KEY environment variable
- GOLOGIN_WEBUNLOCKER_API_KEY legacy environment variable
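If you construct the client yourself, a small lookup like the one below can read either variable name. This is a sketch only: resolveApiKey is a hypothetical helper, and the precedence (new name first, legacy name second) is an assumption, since this README only states that both names are accepted.

```typescript
// Hypothetical helper: picks the API key from an environment-like object.
// Assumption: the new variable wins when both are set; the README only says
// that both names are accepted. Empty strings are treated as unset.
type Env = Record<string, string | undefined>;

function resolveApiKey(env: Env): string {
  const key = env.GOLOGIN_SCRAPING_API_KEY || env.GOLOGIN_WEBUNLOCKER_API_KEY;
  if (!key) {
    throw new Error(
      "Set GOLOGIN_SCRAPING_API_KEY (or the legacy GOLOGIN_WEBUNLOCKER_API_KEY)"
    );
  }
  return key;
}

// Usage with a plain object; in a real script you would pass process.env.
const legacyOnlyKey = resolveApiKey({ GOLOGIN_WEBUNLOCKER_API_KEY: "wu_live_xxx" });
console.log(legacyOnlyKey); // wu_live_xxx
```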
CLI
After build/install, CLI command:
gologin-scraping-api <command> <url> [options]
Commands:
- scrape (raw HTML/text from API)
- text (derived from returned HTML, no JS rendering)
- markdown (derived from returned HTML, no JS rendering)
- json (derived metadata from HTML in SDK)
Options:
- --api-key <key> or GOLOGIN_SCRAPING_API_KEY
- --base-url <url>
- --timeout-ms <number>
- --max-retries <number>
- --envelope for json, to print metadata plus outcome, nextActionHint, and diagnostics
Examples:
gologin-scraping-api scrape https://example.com --api-key wu_live_xxx
GOLOGIN_SCRAPING_API_KEY=wu_live_xxx gologin-scraping-api text https://example.com
GOLOGIN_SCRAPING_API_KEY=wu_live_xxx gologin-scraping-api json https://example.com
GOLOGIN_SCRAPING_API_KEY=wu_live_xxx gologin-scraping-api json https://example.com --envelope
npx gologin-webunlocker text https://example.com --api-key wu_live_xxx
Quick Start
import { ScrapingApi } from "gologin-webunlocker";
const client = new ScrapingApi({
  apiKey: process.env.GOLOGIN_SCRAPING_API_KEY!
});
const result = await client.scrape("https://example.com");
console.log(result.status);
console.log(result.content.slice(0, 500));
Backward-compatible import:
import { WebUnlocker } from "gologin-webunlocker";
Constructor Options
new ScrapingApi({
  apiKey: "wu_live_xxx",
  baseUrl: "https://parsing.webunlocker.gologin.com",
  timeoutMs: 15000,
  maxRetries: 2
});
- apiKey: string required, sent as apikey header
- baseUrl?: string defaults to https://parsing.webunlocker.gologin.com
- timeoutMs?: number defaults to 15000
- maxRetries?: number defaults to 2
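To make the defaults concrete, here is a sketch of how they could be applied to a partial options object. The ClientOptions type and withDefaults function are hypothetical names used for illustration; the SDK's own implementation may differ.

```typescript
// Sketch of applying the documented defaults. ClientOptions and withDefaults
// are hypothetical names; the SDK's internals are not shown in this README.
interface ClientOptions {
  apiKey: string;
  baseUrl?: string;
  timeoutMs?: number;
  maxRetries?: number;
}

function withDefaults(options: ClientOptions): Required<ClientOptions> {
  return {
    apiKey: options.apiKey,
    baseUrl: options.baseUrl ?? "https://parsing.webunlocker.gologin.com",
    timeoutMs: options.timeoutMs ?? 15000,
    // ?? (not ||) keeps an explicit 0, so retries can be switched off.
    maxRetries: options.maxRetries ?? 2,
  };
}

const resolvedOptions = withDefaults({ apiKey: "wu_live_xxx", maxRetries: 0 });
console.log(resolvedOptions.timeoutMs, resolvedOptions.maxRetries); // 15000 0
```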
Normalized scrape() Response
/v1/scrape returns raw HTML/text from the upstream page. The SDK wraps it into a normalized object:
type ScrapeResult = {
  success: true;
  url: string;
  content: string;
  status?: number | null;
  contentType?: string | null;
  headers?: Record<string, string>;
};
scrape() throws typed errors for non-2xx responses.
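The wrapping step can be pictured roughly as follows. This is a sketch under assumptions, not the SDK's code: toScrapeResult is a hypothetical helper, it takes the response parts as plain values, and it assumes url holds the target page URL and contentType is read from the content-type response header.

```typescript
// Mirrors the ScrapeResult shape documented in this README.
type ScrapeResult = {
  success: true;
  url: string;
  content: string;
  status?: number | null;
  contentType?: string | null;
  headers?: Record<string, string>;
};

// Hypothetical helper: wraps raw response parts into a ScrapeResult.
function toScrapeResult(
  url: string,
  httpStatus: number,
  rawHeaders: Record<string, string>,
  body: string
): ScrapeResult {
  // Lowercase header names so lookups are case-insensitive.
  const headers: Record<string, string> = {};
  for (const key of Object.keys(rawHeaders)) {
    headers[key.toLowerCase()] = rawHeaders[key];
  }
  return {
    success: true,
    url,
    content: body,
    status: httpStatus,
    contentType: headers["content-type"] ?? null,
    headers,
  };
}

const sampleResult = toScrapeResult(
  "https://example.com",
  200,
  { "Content-Type": "text/html; charset=utf-8" },
  "<html></html>"
);
console.log(sampleResult.contentType); // text/html; charset=utf-8
```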
scrapeRaw() Example
Use scrapeRaw() when you need direct access to the native fetch Response:
const response = await client.scrapeRaw("https://example.com");
console.log(response.status);
const html = await response.text();
buildScrapeUrl() Example
const requestUrl = client.buildScrapeUrl("https://example.com");
console.log(requestUrl);
// https://parsing.webunlocker.gologin.com/v1/scrape?url=https%3A%2F%2Fexample.com
SDK-Side Derived Methods
These methods are derived from the HTML returned by the API. They do not require additional backend features.
Important:
- they do not execute JavaScript
- they only see the HTML returned by the upstream request
- on JS-heavy sites they may mostly reflect the server-rendered shell rather than the final browser-visible page
scrapeText()
const result = await client.scrapeText("https://example.com");
console.log(result.text.slice(0, 500));
console.log(result.outcome);
console.log(result.nextActionHint);
scrapeMarkdown()
const result = await client.scrapeMarkdown("https://example.com");
console.log(result.markdown.slice(0, 500));
console.log(result.diagnostics);
scrapeJSON()
const result = await client.scrapeJSON("https://example.com");
console.log(result.data.title);
console.log(result.data.description);
console.log(result.data.links.slice(0, 5));
console.log(result.outcome);
console.log(result.outcomeReason);
Derived methods also return lightweight classification fields:
- outcome: ok, empty, incomplete, client_rendered_likely, authwall, challenge, or blocked
- outcomeReason: short explanation
- nextActionHint: suggested next step such as use_gologin_agent_browser
- diagnostics: content length, script count, link count, heading count, and shell-marker detection
This is intended to tell you when the Scraping API probably hit an HTML shell or a gated page instead of complete rendered content.
batchScrape()
const results = await client.batchScrape(
  ["https://example.com", "https://gologin.com"],
  { concurrency: 2 }
);
console.log(results.map((r) => ({ url: r.url, status: r.status })));
Typed Errors
import {
  ScrapingApi,
  ScrapingApiError,
  AuthenticationError,
  RateLimitError,
  APIError,
  TimeoutError,
  NetworkError
} from "gologin-webunlocker";
try {
  const client = new ScrapingApi({ apiKey: "wu_live_xxx" });
  await client.scrape("https://example.com");
} catch (error) {
  if (error instanceof AuthenticationError) {
    console.error("Invalid API key");
  } else if (error instanceof RateLimitError) {
    console.error("Rate limited");
  } else if (error instanceof TimeoutError) {
    console.error("Request timed out");
  } else if (error instanceof NetworkError) {
    console.error("Network failure");
  } else if (error instanceof APIError) {
    console.error("Server/API error");
  } else if (error instanceof ScrapingApiError) {
    console.error("SDK error");
  } else {
    console.error("Unknown error", error);
  }
}
Error mapping:
- 401/403 -> AuthenticationError
- 429 -> RateLimitError
- 500+ -> APIError
- abort/timeout -> TimeoutError
- fetch/network issues -> NetworkError
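The HTTP-status part of the mapping above can be written out as a small function. This sketch returns error class names as strings so that it runs without the SDK installed; errorNameForStatus is a hypothetical helper, and statuses not listed in the mapping are returned as null rather than guessed.

```typescript
// Sketch of the status-to-error mapping listed in this README. It returns
// class names as strings so it runs without the SDK installed.
type StatusErrorName = "AuthenticationError" | "RateLimitError" | "APIError";

function errorNameForStatus(httpStatus: number): StatusErrorName | null {
  if (httpStatus === 401 || httpStatus === 403) return "AuthenticationError";
  if (httpStatus === 429) return "RateLimitError";
  if (httpStatus >= 500) return "APIError";
  // Other statuses are not covered by the documented mapping.
  return null;
}

console.log(errorNameForStatus(403)); // AuthenticationError
console.log(errorNameForStatus(503)); // APIError
```

TimeoutError and NetworkError are left out because they arise from aborts and fetch failures, not from a status code.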
Local Example
GOLOGIN_SCRAPING_API_KEY=wu_live_xxx npm run example
Development
git clone https://github.com/GologinLabs/gologin-scraping-api.git
cd gologin-scraping-api
npm install
npm run build
Release
npm run release:check
npm publish --access public
Routing Rule Of Thumb
- Use gologin-scraping-api when the target is likely server-rendered HTML or an exposed data endpoint.
- Use gologin-agent-browser when useful content appears only after hydration, client-side requests, or interaction.
- If outcome comes back as client_rendered_likely, authwall, challenge, or blocked, treat that as a signal to escalate into a browser tool rather than retrying the same stateless extraction blindly.
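The last rule can be expressed as a small check on the outcome field. This is a sketch: the Outcome union copies the values listed in this README, while shouldEscalateToBrowser is a hypothetical helper. How to handle empty and incomplete is not stated here, so the sketch does not escalate on them.

```typescript
// Outcome values as listed in this README.
type Outcome =
  | "ok"
  | "empty"
  | "incomplete"
  | "client_rendered_likely"
  | "authwall"
  | "challenge"
  | "blocked";

// Outcomes the rule of thumb names as signals to switch to a browser tool.
const ESCALATE_OUTCOMES: ReadonlyArray<Outcome> = [
  "client_rendered_likely",
  "authwall",
  "challenge",
  "blocked",
];

// Hypothetical helper: true when retrying the stateless request is unlikely
// to help. "empty" and "incomplete" are not covered by the rule, so they
// return false here.
function shouldEscalateToBrowser(outcome: Outcome): boolean {
  return ESCALATE_OUTCOMES.includes(outcome);
}

console.log(shouldEscalateToBrowser("challenge")); // true
console.log(shouldEscalateToBrowser("ok")); // false
```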
