job-scout

v1.6.6

TypeScript-native job-scout library

job-scout

TypeScript-first job scraping library built on Crawlee for aggregating jobs from multiple job boards with a single request model and a unified runtime configuration.

Supported providers: indeed, techinasia, kalibrr, glints, jobstreet, bayt, dealls, karir, lokerid, remoteok, jobicy, himalayas, remotive, nodesk, workingnomads. Experimental support for linkedin, zipRecruiter, glassdoor, google, naukri, and bdjobs.

Install

npm install job-scout # or pnpm|yarn|bun add job-scout

Requirements:

  • Node.js >=20
  • Camoufox binaries if you use browser-backed providers such as the experimental linkedin or glints, want JobStreet browser-auth fallback support, or want Glassdoor browser-based location resolution: npx camoufox-js fetch
  • Chromium installed if you want the runtime fallback when Camoufox cannot launch: npx playwright install chromium
  • A runtime capable of launching Camoufox or Playwright Chromium for live LinkedIn, Glints, and optional JobStreet browser-auth fallback. Some locked-down sandboxes cannot launch either browser, even with --no-sandbox.

Quick Start

import { createClient, searchLocations } from 'job-scout/server'

const [yogyakarta] = searchLocations('Yogyakarta', {
	countryIsoCode: 'ID',
	types: ['City'],
	limit: 1,
})

if (!yogyakarta) {
	throw new Error('Yogyakarta not found in location database')
}

const client = createClient({
	runtime: { requestTimeoutMs: 20_000 },
	logging: { level: 'warn' },
})

const run = await client.scout({
	providers: ['indeed'],
	query: 'software engineer',
	locationCode: yogyakarta.code,
	pagination: { limitPerProvider: 20 },
	filters: { postedWithinHours: 72 },
})

const jobs = await run.collect()
const dataset = await run.dataset()

console.log(dataset.id)
console.log(jobs.length)
console.log(jobs[0])

Crawlee Run API

client.scout() is the primary Crawlee-backed entrypoint. It starts a run and returns a handle that can:

  • collect() normalized jobs from the run dataset
  • events() stream live job and lifecycle events
  • dataset() expose the underlying dataset identity
  • stats() report emitted, deduped, completed, and failed counts

import { createClient, searchLocations } from 'job-scout/server'

const [unitedStates] = searchLocations('United States', {
	types: ['Country'],
	limit: 1,
})

if (!unitedStates) {
	throw new Error('United States not found in location database')
}

const client = createClient()

const run = await client.scout({
	providers: ['indeed'],
	query: 'backend engineer',
	locationCode: unitedStates.code,
})

for await (const event of run.events()) {
	if (event.type === 'job') {
		console.log(event.job.title)
	}
}

API Surface

import {
	EXPERIMENTAL_JOB_PROVIDERS,
	STABLE_JOB_PROVIDERS,
	Provider,
} from 'job-scout'

import {
	createClient,
	getLocation,
	getManyLocations,
	searchLocations,
} from 'job-scout/server'

Browser-safe top-level exports:

  • Provider (canonical scraper provider constants)
  • allJobProviders (runtime list of all public provider IDs)
  • STABLE_JOB_PROVIDERS, EXPERIMENTAL_JOB_PROVIDERS, ALL_JOB_PROVIDERS
  • browser-safe value helpers, constants, error classes, and types
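As a sketch of how the provider constants compose (assuming the exported lists are plain readonly string arrays, per the unions in this README; partitionProviders is a hypothetical helper, not a library export), you could split an arbitrary selection into stable and experimental groups:

```typescript
// Sketch only: partition a provider selection into stable vs. experimental,
// given the exported provider ID lists. The arrays passed below are
// illustrative stand-ins for STABLE_JOB_PROVIDERS / EXPERIMENTAL_JOB_PROVIDERS.
function partitionProviders(
	requested: string[],
	stable: readonly string[],
	experimental: readonly string[],
) {
	return {
		stable: requested.filter((p) => stable.includes(p)),
		experimental: requested.filter((p) => experimental.includes(p)),
		unknown: requested.filter(
			(p) => !stable.includes(p) && !experimental.includes(p),
		),
	}
}

const split = partitionProviders(
	['indeed', 'linkedin', 'typo-provider'],
	['indeed', 'remoteok'],
	['linkedin', 'glassdoor'],
)
console.log(split.stable) // ['indeed']
console.log(split.unknown) // ['typo-provider']
```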

Server-only exports:

  • createClient(config?)
  • searchLocations(query, options?)
  • getLocation({ code })
  • getManyLocations(request)

Configured client methods:

  • client.searchLocations(query, options?)
  • client.getLocation({ code })
  • client.getManyLocations(request)
  • client.getBrowserAuthStatus(request)
  • client.bootstrapBrowserAuth(request)
  • client.scout(request)
  • client.streamScout(request, options?)

Location Lookup

locationCode should come from the packaged SQLite location database. Resolve it with searchLocations() before calling the job APIs. The library ships the database as a read-only server-side asset; callers do not need to manage a separate database file.

import {
	getLocation,
	getManyLocations,
	searchLocations,
} from 'job-scout/server'

const matches = searchLocations('South Jakarta', {
	countryIsoCode: 'ID',
	types: ['City'],
	limit: 5,
})

const selected = matches[0]
if (!selected) {
	throw new Error('Location not found')
}

const location = getLocation({ code: selected.code })
const siblings = getManyLocations({
	parentCode: selected.parentCode,
	limit: 10,
})

console.log(selected.code)
console.log(location)
console.log(siblings)

type LocationSearchOptions = {
	countryIsoCode?: string // Narrow matches to a specific country ISO code.
	parentCode?: string | null // Scope search under one parent location.
	types?: LocationType[] // Restrict results to location kinds such as 'City' or 'Country'.
	limit?: number // Cap the number of returned matches.
}

type GetManyLocationsRequest =
	| {
			codes: string[] // Resolve many exact location codes in one call.
			countryIsoCode?: string
			types?: LocationType[]
			limit?: number
	  }
	| {
			parentCode?: string | null // Use null to browse top-level records.
			countryIsoCode?: string
			types?: LocationType[]
			limit?: number
	  }

type GetLocationRequest = {
	code: string // Resolve one exact location code.
}

type LocationRecord = {
	code: string // Canonical ID used by job search requests.
	name: string // Base location name.
	type: LocationType // Exported union of supported location kinds.
	parentCode: string | null // Parent location when the record is nested.
	countryIsoCode: string | null // Canonical ISO country code.
	display: string // Human-readable display label.
	hasChildren: boolean // Whether getManyLocations({ parentCode: code }) can drill into nested results.
}

Country lookups normalize sovereign-like top-level records as Country results. This includes dataset-backed records such as Taiwan and Hong Kong, plus generated top-level country roots for datasets that only contain subregions, such as China.
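The hasChildren flag signals that a record can be browsed further. As a small sketch (local type mirroring the relevant LocationRecord fields; buildDrillDown and the 'ID-34' code are hypothetical, not library exports), building the parentCode-style request you would hand to getManyLocations() might look like:

```typescript
// Local mirror of the relevant LocationRecord fields from above.
type LocationRecordLite = {
	code: string
	hasChildren: boolean
}

// Hypothetical helper: returns the parentCode-style GetManyLocationsRequest
// for listing a record's children, or null when nothing is nested under it.
function buildDrillDown(record: LocationRecordLite, limit = 10) {
	if (!record.hasChildren) return null
	return { parentCode: record.code, limit }
}

// 'ID-34' is an invented code for illustration only.
const request = buildDrillDown({ code: 'ID-34', hasChildren: true })
console.log(request) // { parentCode: 'ID-34', limit: 10 }
```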

Client API

Use createClient() when you want to reuse the same config across many requests.

import { createClient } from 'job-scout/server'

const client = createClient({
	runtime: { requestTimeoutMs: 20_000 },
	logging: { level: 'warn' },
})

const [austin] = client.searchLocations('Austin', {
	countryIsoCode: 'US',
	types: ['City'],
	limit: 1,
})

if (!austin) {
	throw new Error('Austin not found in location database')
}

const run = await client.scout({
	providers: ['indeed'],
	query: 'backend engineer',
	locationCode: austin.code,
})
const jobs = await run.collect()

Use the standalone location functions when you do not need shared config. Use createClient() for any operation that depends on runtime config, including client.scout(request), client.streamScout(request, options?), client.getBrowserAuthStatus(request), and client.bootstrapBrowserAuth(request).

Custom Logging

Job Scout uses a small logger interface instead of requiring a logging framework. By default, library logs go to console, and logging.level controls which messages are emitted.

import type { Logger } from 'job-scout'
import { createClient } from 'job-scout/server'

const logger: Logger = {
	error(message, ...args) {
		console.error('[job-scout:error]', message, ...args)
	},
	warn(message, ...args) {
		console.warn('[job-scout:warn]', message, ...args)
	},
	info(message, ...args) {
		console.info('[job-scout:info]', message, ...args)
	},
	debug(message, ...args) {
		console.debug('[job-scout:debug]', message, ...args)
	},
}

const client = createClient({
	logging: {
		level: 'info',
		logger,
	},
})

Injected loggers receive formatted messages with component names such as JobScout:engine and JobScout:provider:linkedin, so you can route or reformat library logs without adopting a specific logging package.

Request Model

Use the request shape below as the main reference for client.scout() and client.streamScout().

type JobSearchRequest = {
	providers: JobProvider[] // Required, non-empty. Stable and experimental provider IDs are exported unions.
	query?: string // Search keywords used by most providers.
	locationCode?: string // Canonical location code from searchLocations().
	pagination?: {
		limitPerProvider?: number // Max jobs fetched from each provider. Default: 15.
		offset?: number // Provider-specific offset where supported. Default: 0.
	}
	filters?: {
		distanceMiles?: number // Default: 50.
		remote?: boolean // Default: false.
		easyApply?: boolean
		employmentType?: EmploymentType
		postedWithinHours?: number
	}
	linkedin?: {
		companyIds?: number[] // Limit LinkedIn results to specific company IDs.
	}
}
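As one illustrative literal against this shape (a plain-object sketch with invented values; remoteok and remotive already serve only remote roles, so filters.remote is shown for completeness rather than necessity):

```typescript
// Sketch: a remote-focused request matching the JobSearchRequest shape above.
const remoteRequest = {
	providers: ['remoteok', 'remotive'],
	query: 'typescript developer',
	pagination: { limitPerProvider: 10, offset: 0 },
	filters: { remote: true, postedWithinHours: 48 },
}

console.log(remoteRequest.providers.length) // 2
```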

Provider Rules

type StableJobProvider =
	| 'indeed'
	| 'techinasia'
	| 'kalibrr'
	| 'glints'
	| 'jobstreet'
	| 'bayt'
	| 'dealls'
	| 'karir'
	| 'lokerid'
	| 'remoteok'
	| 'jobicy'
	| 'himalayas'
	| 'remotive'
	| 'nodesk'
	| 'workingnomads'

type ExperimentalJobProvider =
	| 'linkedin'
	| 'zipRecruiter'
	| 'glassdoor'
	| 'google'
	| 'naukri'
	| 'bdjobs'

Runtime constants are exported too, so you can avoid hardcoded provider strings:

import {
	EXPERIMENTAL_JOB_PROVIDERS,
	STABLE_JOB_PROVIDERS,
	Provider,
} from 'job-scout'

const stableOnly = [...STABLE_JOB_PROVIDERS]
const experimentalOnly = [...EXPERIMENTAL_JOB_PROVIDERS]

console.log(Provider.ZIP_RECRUITER) // "zip_recruiter" (internal scraper ID)

  • Experimental providers still require explicit config opt-in before use.
  • locationCode is also used for provider market selection; unsupported regions are skipped during request compilation.
  • Region-locked providers currently include dealls (Indonesia), glints (Indonesia, Singapore, Vietnam, Malaysia, Taiwan, Philippines, China, and Hong Kong), jobstreet (Malaysia, Singapore, Philippines, and Indonesia), kalibrr (Indonesia and Philippines), lokerid (Indonesia), naukri (India), and bdjobs (Bangladesh).
  • jobstreet and kalibrr require locationCode so the provider can select a supported country domain/market.
  • Glints falls back to the selected country's default all-locations search when a market-specific location label cannot be resolved cleanly.
  • JobStreet uses GraphQL search/detail with a browser fallback when GraphQL auth/session/contract failures block search. It still requires locationCode for market selection, ignores distanceMiles (unsupported upstream), and verifies easyApply from the detail response.
  • Kalibrr uses public HTTP JSON endpoints, applies explicit country filters for Indonesia/Philippines, and maps remote plus supported employment-type filters natively while verifying postedWithinHours and easyApply client-side.
  • Lokerid uses same-origin Remix data endpoints on www.loker.id, supports keyword search in the first release, and ignores unsupported shared filters instead of mapping them natively.
  • Jobicy uses Jobicy's public JSON remote-jobs feed, always normalizes results as remote roles, maps query to upstream tag, applies postedWithinHours client-side, and only sends broad geo filters when locationCode can be reduced cleanly to a country or coarse region.
  • Himalayas uses Himalayas' public JSON jobs API, always normalizes results as remote roles, maps query, locationCode country, and supported employment types upstream, includes worldwide-friendly jobs alongside country-matched jobs, and applies postedWithinHours client-side.
  • Remotive uses Remotive's public JSON remote-jobs API, always normalizes results as remote roles, maps query upstream to search, applies locationCode, employmentType, and postedWithinHours client-side, and keeps worldwide or region-compatible restrictions when country filtering is requested.
  • Nodesk uses Nodesk's public Algolia-backed remote job index with detail-page enrichment, maps query upstream, applies best-effort internal remote-region mapping from locationCode, keeps worldwide or compatible region buckets when country filtering is requested, and falls back to the no-JS HTML list when Algolia is unavailable.

Filter Constraints

Some providers reject incompatible filter combinations. The library enforces those combinations in TypeScript and at runtime.

  • Indeed supports only one filter group at a time: postedWithinHours, or easyApply, or employmentType/remote
  • LinkedIn cannot combine postedWithinHours with easyApply
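The Indeed constraint above can be pictured as a small validator. This is a local sketch of the rule only, not the library's actual runtime check:

```typescript
// Sketch of the Indeed rule: at most one filter group may be set at a time,
// where the groups are (1) postedWithinHours, (2) easyApply, (3) employmentType/remote.
type IndeedFiltersSketch = {
	postedWithinHours?: number
	easyApply?: boolean
	employmentType?: string
	remote?: boolean
}

function violatesIndeedFilterRule(filters: IndeedFiltersSketch): boolean {
	const groupsInUse = [
		filters.postedWithinHours !== undefined,
		filters.easyApply !== undefined,
		filters.employmentType !== undefined || filters.remote !== undefined,
	]
	return groupsInUse.filter(Boolean).length > 1
}

console.log(violatesIndeedFilterRule({ postedWithinHours: 24 })) // false
console.log(violatesIndeedFilterRule({ postedWithinHours: 24, easyApply: true })) // true
```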

Example of a valid request for JobStreet:

const request = {
	providers: ['jobstreet'],
	query: 'software engineer',
	locationCode: 'MY-14-KUL',
} satisfies JobSearchRequest

Configuration Model

type JobScoutConfig = {
	enrichment?: 'normal' | 'high' | 'veryHigh' // Enrichment effort level for all providers. Default: 'normal'. Higher levels can trigger extra shared/company-website enrichment steps. `veryHigh` can also fall back to authenticated LinkedIn company profile scraping when LinkedIn is explicitly enabled and a company LinkedIn URL is known but the company website is still missing.
	runtime?: {
		requestTimeoutMs?: number // Default: 20_000 ms.
		providerFailureMode?: 'throw' | 'swallow' // Default: 'throw'. Controls whether per-provider scraper failures make `run.collect()` throw or return only successful-provider results.
		storage?:
			| boolean // `true` => persistent storage in the default `.job-scout-storage` directory. `false` => ephemeral.
			| string // Persistent storage rooted at this directory.
			| {
					mode?: 'ephemeral' | 'persistent' // Default: 'ephemeral'.
					directory?: string // Optional persistent storage directory. Defaults to `.job-scout-storage`.
			  }
		proxy?: {
			urls?: string[] // Rotating proxy list used by Crawlee proxy configuration. Default: []. Example: ['http://user:pass@proxyserver:port'].
		}
		browser?: {
			userAgent?: string // Shared user agent override for HTTP and browser-backed crawls.
			headless?: boolean // Browser mode for Playwright-backed scraping. Default: true.
		}
		browserAuth?: {
			profiles?: Record<
				string,
				| {
						provider: 'jobstreet'
						market: 'MY' | 'SG' | 'PH' | 'ID'
				  }
				| {
						provider: 'linkedin'
				  }
			>
			jobstreet?: Partial<Record<'MY' | 'SG' | 'PH' | 'ID', string>> // Shorthand provider mapping.
			linkedin?: string // Shorthand provider mapping.
			providerProfiles?: {
				jobstreet?: Partial<Record<'MY' | 'SG' | 'PH' | 'ID', string>>
				linkedin?: string
			}
			// Default: no profiles and no provider mappings (browser auth disabled).
		}
		jobstreetSession?: Partial<
			Record<
				'MY' | 'SG' | 'PH' | 'ID',
				{
					cookies:
						| string
						| Array<{
								name: string
								value: string
						  }>
					solId?: string
					sessionId?: string
					visitorId?: string
					userQueryId?: string
					providerContext?: string
					include?: string[]
					queryHints?: string[]
					relatedSearchesCount?: number
				}
			>
		>
		// Optional programmatic JobStreet GraphQL session bundles. Default: disabled.
		concurrency?:
			| number // Shorthand: applies the same global limit to both HTTP and browser work.
			| {
					providers?: number // Max providers to run at once. Default: all requested providers in parallel.
					http?:
						| number // Global cap for HTTP/Crawlee request work. Default: 24.
						| {
								global?: number // Global cap for HTTP/Crawlee request work.
								perProvider?: Partial<
									Record<JobProvider, number> // Provider-specific HTTP request caps.
								>
						  }
					browser?:
						| number // Global cap for Playwright/browser tasks across scraping and browser-assisted enrichment. Default: 2.
						| {
								global?: number // Global cap for Playwright/browser tasks.
								perProvider?: Partial<
									Record<JobProvider, number> // Provider-specific browser task caps.
								>
						  }
			  }
		retry?:
			| false // Disable list/detail retries.
			| number // Shorthand: applies the same retry budget to list and detail pages.
			| {
					list?: number // Shorthand for listPages.
					detail?: number // Shorthand for detailPages.
					backoff?: {
						baseMs?: number // Shorthand for baseDelayMs.
						maxMs?: number // Shorthand for maxDelayMs.
					}
					listPages?: number // Default: 2.
					detailPages?: number // Default: 1.
					baseDelayMs?: number // Default: 250.
					maxDelayMs?: number // Default: 3000.
			  }
		sessions?: {
			enabled?: boolean // Default: true.
			persistCookies?: boolean // Default: true.
			maxPoolSize?: number // Default: 50.
			maxUsageCount?: number // Default: 25.
			maxAgeSecs?: number // Default: 1800.
		}
		advanced?: {
			maxRequestsPerMinute?: number // Unlimited unless set.
		}
	}
	experimental?: {
		sites?: ExperimentalJobProvider[] // Preferred shorthand list of enabled experimental providers.
		experimentalSites?: Partial<Record<ExperimentalJobProvider, boolean>> // Missing keys default to false.
	}
	output?: {
		descriptionFormat?: 'markdown' | 'html' | 'plain' // Default: 'markdown'. `plain` preserves readable paragraphs and list bullets.
		annualizeSalary?: boolean // Default: false.
		salaryFallback?: 'usOnly' // Default: 'usOnly'.
	}
	logging?:
		| 'error' | 'warn' | 'info' | 'debug' // Shorthand log level.
		| {
				level?: 'error' | 'warn' | 'info' | 'debug' // Default: 'error'.
				logger?: Logger // Optional sink for Job Scout log messages. Defaults to console.
		  }
}

Defaults:

  • runtime.storage = true is shorthand for persistent storage in the default .job-scout-storage directory.
  • runtime.storage = '.job-scout-storage' is shorthand for persistent storage in that directory.
  • runtime.storage.mode defaults to ephemeral, so library calls do not persist Crawlee storage unless you opt in.
  • runtime.browser.headless defaults to true.
  • runtime.providerFailureMode defaults to throw.
  • runtime.browserAuth is opt-in and disabled by default.
  • runtime.jobstreetSession is opt-in and disabled by default.
  • runtime.sessions.enabled defaults to true.
  • runtime.requestTimeoutMs defaults to 20_000.
  • runtime.retry = false disables list/detail retries.
  • runtime.retry = 2 applies the same retry budget to list and detail pages.
  • runtime.concurrency = 1 is shorthand for setting both HTTP and browser global concurrency to 1.
  • runtime.concurrency.providers defaults to all requested providers running in parallel.
  • runtime.concurrency.http = 24 is shorthand for setting the HTTP global concurrency cap to 24.
  • runtime.concurrency.browser = 2 is shorthand for setting the browser global concurrency cap to 2.
  • runtime.concurrency.http.global defaults to 24.
  • runtime.concurrency.browser.global defaults to 2.
  • Base per-provider concurrency defaults are 5 for browser-backed providers (linkedin, google, glints, jobstreet) and 24 for non-browser providers. JobStreet keeps the browser-backed default because it can still fall back to the browser scraper when GraphQL is unavailable.
  • runtime.concurrency.http.perProvider[provider] overrides the HTTP limit for that provider.
  • runtime.concurrency.browser.perProvider[provider] overrides the browser limit for that provider.
  • Per-scope concurrency precedence is: scoped provider override, then scraper-specific override (if a scraper defines one), then the runtime base default above.
  • experimental: { sites: ['linkedin', 'google'] } is the shorthand for enabling experimental providers.
  • logging: 'info' is shorthand for { logging: { level: 'info' } }.
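Putting several of these defaults together, a runtime config sketch that opts into persistent storage, swallowed provider failures, and modest concurrency might look like this (all values illustrative):

```typescript
// Sketch: combining several runtime knobs from the configuration model above.
const config = {
	runtime: {
		requestTimeoutMs: 30_000,
		providerFailureMode: 'swallow', // return successful-provider results instead of throwing
		storage: '.job-scout-storage', // shorthand: persistent storage rooted at this directory
		concurrency: {
			http: { global: 12 },
			browser: { global: 1 },
		},
		retry: { listPages: 2, detailPages: 1 },
	},
	logging: 'warn', // shorthand for { logging: { level: 'warn' } }
}

console.log(config.runtime.providerFailureMode) // swallow
```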

Example enabling an experimental provider:

const config = {
	experimental: {
		experimentalSites: {
			linkedin: true,
			google: true,
		},
	},
} satisfies JobScoutConfig

Browser Auth

LinkedIn is experimental, browser-auth-only, and can reuse manually bootstrapped browser logins. JobStreet can either use a programmatic GraphQL session bundle or reuse a manually bootstrapped browser login. Glassdoor uses normal HTTP scraping and may use the browser only to resolve locations. JobStreet auth is market-scoped for MY, SG, PH, and ID; LinkedIn auth is provider-scoped.

Constraints:

  • Browser auth requires runtime.storage.mode = 'persistent'.
  • Browser auth supports zero proxies or one fixed proxy URL only. Rotating proxy pools are rejected when auth is configured.
  • client.getBrowserAuthStatus() live-validates the saved profile by opening the auth provider in a browser context seeded from the stored state.
  • client.bootstrapBrowserAuth() launches a headed browser and expects you to complete the LinkedIn or SEEK/JobStreet sign-in flow yourself, unless skipIfReady: true is set and the saved auth state already validates successfully.
  • LinkedIn browser auth is fail-fast. If the saved login is missing or invalid, the provider raises an error instead of silently falling back to anonymous mode.
  • Set runtime.providerFailureMode = 'swallow' if you want run.collect() to return only successful-provider results when a provider such as LinkedIn fails.
  • JobStreet prefers runtime.jobstreetSession for GraphQL search. If GraphQL search is blocked by auth/session requirements or request/contract failures, the provider falls back to the browser scraper. A JobStreet browser profile is still optional and is only used when you want to seed that browser session with a saved login.

Example programmatic JobStreet session bundle:

const config = {
	runtime: {
		jobstreetSession: {
			ID: {
				cookies: [
					{ name: 'sol_id', value: 'visitor-id' },
					{ name: 'JobseekerSessionId', value: 'session-id' },
					{ name: 'JobseekerVisitorId', value: 'visitor-id' },
				],
				solId: 'visitor-id',
				sessionId: 'session-id',
				visitorId: 'visitor-id',
			},
		},
	},
} satisfies JobScoutConfig

Example:

import { createClient } from 'job-scout/server'

const config = {
	runtime: {
		storage: {
			mode: 'persistent',
			directory: '.job-scout-storage',
		},
		browserAuth: {
			profiles: {
				'jobstreet-my-main': {
					provider: 'jobstreet',
					market: 'MY',
				},
			},
			providerProfiles: {
				jobstreet: {
					MY: 'jobstreet-my-main',
				},
			},
		},
	},
} satisfies JobScoutConfig

const client = createClient(config)

const authStatus = await client.getBrowserAuthStatus({
	provider: 'jobstreet',
	market: 'MY',
})

await client.bootstrapBrowserAuth({
	provider: 'jobstreet',
	market: 'MY',
	skipIfReady: true,
})

const run = await client.scout({
	providers: ['jobstreet'],
	query: 'software engineer',
	locationCode: 'MY-14-KUL',
})
const jobs = await run.collect()

client.getBrowserAuthStatus() returns a result shaped like:

type BrowserAuthStatusResult = {
	provider: 'jobstreet' | 'linkedin'
	profile: string
	market?: JobStreetAuthMarket
	status: 'ready' | 'missing' | 'needsBootstrap'
	exists: boolean
	usable: boolean
	storageStatePath: string | null
	checkedAt: Date
	reason?: 'missing' | 'invalidated' | 'mismatch' | 'unauthenticated'
}

client.bootstrapBrowserAuth() returns:

type BrowserAuthBootstrapResult = {
	provider: 'jobstreet' | 'linkedin'
	profile: string
	market?: JobStreetAuthMarket
	storageStatePath: string
	authenticatedAt: Date
	reusedExisting: boolean
}

LinkedIn is experimental and browser-auth-only. By default, if providers: ['linkedin'] is requested without both the experimental opt-in (experimental: { sites: ['linkedin'] } or experimental.experimentalSites.linkedin = true) and a configured, authenticated LinkedIn profile, run.collect() throws a browser-auth-required provider error. The same LinkedIn opt-in also enables the veryHigh company-profile enrichment fallback. Set runtime.providerFailureMode = 'swallow' to keep the run result and return [] for the failed LinkedIn portion instead.

The same LinkedIn browser auth profile is also reused by shared enrichment at enrichment: 'veryHigh' for non-LinkedIn jobs when:

  • company.linkedInUrl is already known
  • company.websiteUrl is still missing

That fallback visits the LinkedIn company /about/ page, fills only missing company fields, and if it discovers a website URL the existing company website enrichment can continue in the same run.

runtime.browserAuth.jobstreet maps the JobStreet market used at scrape time to the named profile that should seed the browser session. runtime.browserAuth.linkedin maps LinkedIn scraping to a named LinkedIn profile. The older runtime.browserAuth.providerProfiles.* form is also accepted. client.bootstrapBrowserAuth() can also target a specific profile directly with profile: 'jobstreet-my-main' or profile: 'linkedin-main'.
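For instance, the shorthand mappings can stand in for providerProfiles (profile names here are illustrative; this is a plain-object sketch of the config shape above):

```typescript
// Sketch: shorthand browser-auth mapping. Profile names are made up.
const authConfig = {
	runtime: {
		storage: { mode: 'persistent' }, // browser auth requires persistent storage
		browserAuth: {
			profiles: {
				'jobstreet-my-main': { provider: 'jobstreet', market: 'MY' },
				'linkedin-main': { provider: 'linkedin' },
			},
			// Shorthand forms, equivalent to the older providerProfiles.* mapping:
			jobstreet: { MY: 'jobstreet-my-main' },
			linkedin: 'linkedin-main',
		},
	},
}

console.log(authConfig.runtime.browserAuth.jobstreet.MY) // jobstreet-my-main
```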

Result Model

client.scout() returns a ScoutRun. Call collect() when you want the materialized Job[]. In the default runtime.providerFailureMode = 'throw', collect() throws on the first failed provider. In 'swallow', it returns jobs from successful providers and omits failed ones.

const client = createClient(config)
const run = await client.scout(request)
const jobs = await run.collect()
const stats = await run.stats()

run.stats() returns aggregate counters plus per-provider summaries. In runtime.providerFailureMode = 'swallow', use providerSummaries to distinguish failed providers from successful providers that returned zero jobs.

type ScoutRunStats = {
	status: 'running' | 'completed' | 'failed'
	emittedTotal: number
	skippedByDedupeTotal: number
	providerCount: number
	completedProviders: number
	failedProviders: number
	providerSummaries: Array<{
		provider: JobProvider
		status: 'pending' | 'running' | 'succeeded' | 'failed'
		emitted: number
		skippedByDedupe: number
		errorMessage?: string
	}>
}
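In swallow mode, a small helper over this shape can separate failed providers from successful ones that simply returned zero jobs. This is a local sketch with a mirrored summary type, not a library export:

```typescript
// Local mirror of the providerSummaries entries from ScoutRunStats above.
type ProviderSummary = {
	provider: string
	status: 'pending' | 'running' | 'succeeded' | 'failed'
	emitted: number
	skippedByDedupe: number
	errorMessage?: string
}

// Sketch: distinguish failed providers from successful-but-empty providers.
function triageProviders(summaries: ProviderSummary[]) {
	return {
		failed: summaries
			.filter((s) => s.status === 'failed')
			.map((s) => s.provider),
		emptyButSucceeded: summaries
			.filter((s) => s.status === 'succeeded' && s.emitted === 0)
			.map((s) => s.provider),
	}
}

const triage = triageProviders([
	{ provider: 'indeed', status: 'succeeded', emitted: 12, skippedByDedupe: 1 },
	{ provider: 'linkedin', status: 'failed', emitted: 0, skippedByDedupe: 0, errorMessage: 'auth' },
	{ provider: 'remoteok', status: 'succeeded', emitted: 0, skippedByDedupe: 0 },
])
console.log(triage.failed) // ['linkedin']
console.log(triage.emptyButSucceeded) // ['remoteok']
```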

Collected jobs are unified Job[] records from providers without an extra domain remapping step.

type Job = {
	id?: string | null
	provider?: Provider | null
	title: string
	jobUrl: string
	jobUrlDirect?: string | null
	location?: Location | null
	description?: string | null
	jobType?: JobType | null
	company: {
		name?: string | null
		providerUrl?: string | null
		websiteUrl?: string | null
		phones?: string[] | null
		linkedInUrl?: string | null
		employeeLinkedInUrls?: string[] | null // Public employee/recruiter LinkedIn profile URLs found on the company website.
		socialUrls?: string[] | null
		careersUrl?: string | null
		industry?: string | null
		addresses?: string | null
		numEmployees?:
			| '1-10'
			| '11-50'
			| '51-200'
			| '201-500'
			| '501-1000'
			| '1001-5000'
			| '5001-10000'
			| '10000+'
			| null
		revenue?: string | null
		foundedYear?: number | null
		hqCountryIsoCode?: string | null
		description?: string | null
		logo?: string | null
		rating?: number | null
		reviewsCount?: number | null
	}
	salary?: {
		interval?: 'yearly' | 'monthly' | 'weekly' | 'daily' | 'hourly' | null
		minAmount?: number | null
		maxAmount?: number | null
		currency?: string | null
	} | null
	postedAt?: Date | null
	expiresAt?: Date | null
	recruiters?: Array<{
		name?: string
		email?: string | null
	}>
	additionalEmails?: string[] | null // Emails found in the posting after recruiter-like addresses are split out.
	potentialRecruiterEmails?: string[] | null // Heuristically recruiter-like emails (for example hr/careers aliases).
	workMode?: 'remote' | 'hybrid' | 'onSite' | null
	badges?: string[] // Normalized listing badges/labels such as premium employer or boosted.
	tags?: string[] // Additional non-badge provider labels such as keywords, categories, or source-specific descriptors.
	level?: string | null
	field?: string | null
	skills?: string[] | null
	benefits?: string[] | null
	experienceRange?: '0-3' | '3-5' | '5-8' | '8-12' | '12+' | null
	vacancyCount?: number | null
}

type Location = {
	country: string | null
	city: string | null
	state: string | null
	displayLocation(): string
}
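As a consumption sketch of this interface, here is a locally constructed object implementing it. The joining format inside displayLocation() is this sketch's own guess; the library's exact output may differ:

```typescript
// Local object matching the Location shape above, with an illustrative
// displayLocation() implementation (not the library's actual formatting).
const sampleLocation = {
	country: 'Indonesia' as string | null,
	city: 'Yogyakarta' as string | null,
	state: null as string | null,
	displayLocation(): string {
		return [this.city, this.state, this.country]
			.filter((part): part is string => part !== null)
			.join(', ')
	},
}

console.log(sampleLocation.displayLocation()) // Yogyakarta, Indonesia
```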

Streaming API

Use the streaming APIs when you want to process results incrementally instead of waiting for a full batch.

import { createClient, searchLocations } from 'job-scout/server'

const [unitedStates] = searchLocations('United States', {
	types: ['Country'],
	limit: 1,
})

if (!unitedStates) {
	throw new Error('United States not found in location database')
}

const client = createClient()

for await (const event of client.streamScout(
	{
		providers: ['indeed'],
		query: 'backend engineer',
		locationCode: unitedStates.code,
		pagination: { limitPerProvider: 10 },
	},
	{
		includeLifecycleEvents: true,
		dedupeStrategy: 'crossProvider',
	},
)) {
	if (event.type === 'job') {
		console.log(event.provider, event.providerIndex, event.job.title)
		continue
	}

	console.log(event.type)
}

type JobStreamOptions = {
	failFast?: boolean // Stop the stream after the first provider failure. Default: false.
	includeLifecycleEvents?: boolean // Emit providerStart, providerDone, providerError, and complete. Default: false.
	dedupeStrategy?: 'batchCompatible' | 'crossProvider' // Default: 'crossProvider'.
}

type JobStreamEvent =
	| {
			type: 'job'
			provider: JobProvider
			providerIndex: number
			globalIndex: number
			job: Job
	  }
	| {
			type: 'providerStart'
			provider: JobProvider
	  }
	| {
			type: 'providerError'
			provider: JobProvider
			error: unknown
	  }
	| {
			type: 'providerDone'
			provider: JobProvider
			emitted: number
			skippedByDedupe: number
	  }
	| {
			type: 'complete'
			emittedTotal: number
			skippedByDedupeTotal: number
			providerCount: number
	  }

Streaming order is completion-order within a provider. Batch APIs still collect and normalize the full result set before returning.

Examples

The example scripts in the repository import from ../src/server and are intended for local repository usage. Contributor workflow, tests, and release steps are documented in CONTRIBUTING.md.