@ridvnv/ikea-scraper
v0.1.0
A clean, typed SDK for searching IKEA and reading product details. Zod-validated data, typed errors, HTTP-first with optional headless-browser fallback.
ikea-scraper
A clean, typed TypeScript SDK for searching IKEA and reading product details.
- Zod-validated: every value crossing the network boundary is validated at runtime, so a Product you receive is guaranteed to match its type.
- Typed errors: failures are IkeaScraperError subclasses (NetworkError, BlockedError, NotFoundError, ParseError, ValidationError, …) you can branch on.
- HTTP-first: uses a plain fetch and only falls back to a headless browser when needed. Puppeteer is an optional peer dependency.
- Resilient: retries with exponential backoff + jitter, per-request timeouts, concurrency limiting, and pluggable caching.
- Comprehensive data: pulls product details from schema.org JSON-LD, __NEXT_DATA__, and Open Graph meta tags.
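To make "retries with exponential backoff + jitter" concrete, here is a minimal sketch of the general technique. It is not the SDK's internal code: `backoffDelayMs`, `withRetries`, and the 250 ms / 8 s defaults are hypothetical names and values chosen for illustration, and the "full jitter" variant shown is only one common choice.

```typescript
// Hypothetical helpers for illustration; not part of the SDK's public API.

// Delay before retry number `attempt` (0 = first retry).
function backoffDelayMs(
  attempt: number,
  baseMs: number = 250, // assumed base delay
  maxMs: number = 8000, // assumed cap
  random: () => number = Math.random,
): number {
  // The ceiling doubles with every attempt, up to the cap.
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // "Full jitter": pick a delay uniformly in [0, ceiling).
  return Math.floor(random() * ceiling);
}

// Runs `task`, retrying up to `retries` times with a jittered delay in between.
async function withRetries<T>(task: () => Promise<T>, retries: number = 2): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Out of retries: surface the last error to the caller.
      if (attempt >= retries) throw err;
      const delay = backoffDelayMs(attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

The jitter spreads retries from many concurrent callers over time so that they do not all hit the site again at the same instant.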
Install
npm install @ridvnv/ikea-scraper
# optional: only needed for the headless-browser fallback
npm install puppeteer

Requires Node.js 18+ (uses the global fetch). Ships ESM and CommonJS builds with type declarations.
Quick start
import { IkeaScraper } from "@ridvnv/ikea-scraper";

const ikea = new IkeaScraper({ locale: "nl/nl" });
try {
  const results = await ikea.search("sofa", { limit: 5 });
  const product = await ikea.getProduct(results[0]!.url);
  console.log(product.name, product.price, product.dimensions);
} finally {
  await ikea.close();
}

API
new IkeaScraper(options?)
See Configuration for all options.
search(query, options?) => Promise<SearchResult[]>
Search IKEA and get ranked listing results. options: { locale?, limit? }.
searchProducts(query, options?) => Promise<Product[]>
Search and resolve full product details for each result (failures are skipped). options: { locale?, limit? }.
getProduct(url) => Promise<Product>
Fetch and normalise a single product page.
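As a rough illustration of what reading schema.org JSON-LD from a product page involves, the sketch below pulls Product nodes out of raw HTML. It is an assumption-laden example, not the SDK's parser: `extractJsonLdProducts` is a hypothetical name, and a regex scan stands in for the HTML parsing a real scraper would more likely use.

```typescript
// Hypothetical helper for illustration; the SDK's own extraction may work differently.
function extractJsonLdProducts(html: string): Record<string, unknown>[] {
  // Match every <script type="application/ld+json">…</script> block.
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const products: Record<string, unknown>[] = [];
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(html)) !== null) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(match[1] ?? "");
    } catch {
      continue; // skip blocks that are not valid JSON
    }
    // A block may hold a single node or an array of nodes.
    const nodes: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
    for (const node of nodes) {
      if (typeof node === "object" && node !== null) {
        const record = node as Record<string, unknown>;
        if (record["@type"] === "Product") products.push(record);
      }
    }
  }
  return products;
}
```

The result is still untyped data at this point. Runtime validation is the step that turns it into a trustworthy Product.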
getProducts(urls, options?) => Promise<Product[]>
Fetch several products concurrently. With { ignoreErrors: true }, individual failures are skipped instead of throwing.
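The sketch below shows one way the documented semantics (bounded concurrency, with failures either skipped or thrown) could be implemented. It is a stand-alone illustration under stated assumptions: `fetchAll` and `fetchOne` are hypothetical stand-ins for getProducts and getProduct, and throwing the first failure in input order once all requests have settled is a choice made here, not confirmed SDK behaviour.

```typescript
// Hypothetical stand-in for getProducts; `fetchOne` plays the role of getProduct.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

async function fetchAll<T>(
  urls: string[],
  fetchOne: (url: string) => Promise<T>,
  options: { concurrency?: number; ignoreErrors?: boolean } = {},
): Promise<T[]> {
  const { concurrency = 2, ignoreErrors = false } = options;
  const queue = urls.map((url, index) => ({ url, index }));
  const outcomes: Outcome<T>[] = new Array(urls.length);

  // Each worker takes the next queued job until the queue is empty,
  // so at most `concurrency` requests are in flight at once.
  async function worker(): Promise<void> {
    for (let job = queue.shift(); job !== undefined; job = queue.shift()) {
      try {
        outcomes[job.index] = { ok: true, value: await fetchOne(job.url) };
      } catch (error) {
        outcomes[job.index] = { ok: false, error };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, urls.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));

  // Results keep input order; failures are dropped or rethrown.
  const values: T[] = [];
  for (const outcome of outcomes) {
    if (outcome.ok) values.push(outcome.value);
    else if (!ignoreErrors) throw outcome.error;
  }
  return values;
}
```

With ignoreErrors enabled the returned array can be shorter than the input, so callers should not match results to URLs by index.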
buildRoomSet(room) => Promise<RoomSet>
Build a curated set of products grouped by category for a known room ("livingRoom", "bedroom", "office").
const livingRoom = await ikea.buildRoomSet("livingRoom");
// { room: "livingRoom", groups: { sofa: [...], coffeeTable: [...], tvStand: [...] } }

clearCache() => Promise<void>
Empty the cache.
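For intuition about what a time-limited cache with a clear operation does, here is a small in-memory sketch. It is not the SDK's MemoryCache: `SimpleCache` and `TtlMemoryCache` are hypothetical, the synchronous method signatures are an assumption (the README only names get / set / delete / clear), and the injectable clock exists purely to make expiry easy to test.

```typescript
// Assumed, simplified interface; the SDK's real Cache type may be async or differ in detail.
interface SimpleCache<V> {
  get(key: string): V | undefined;
  set(key: string, value: V): void;
  delete(key: string): void;
  clear(): void;
}

class TtlMemoryCache<V> implements SimpleCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired entries are evicted lazily, on read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  delete(key: string): void {
    this.entries.delete(key);
  }

  clear(): void {
    this.entries.clear();
  }
}
```

A lifetime of 3600000 ms in this sketch corresponds to the documented one-hour default for cacheTtlMs.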
close() => Promise<void>
Release resources, shutting down the headless browser if one was launched.
The Product shape
type Product = {
  id: string;
  name: string;
  typeName?: string;
  fullName?: string;
  url: string;
  description?: string;
  brand?: string;
  source: "ikea";
  price?: { amount: number; currency: string; previousAmount?: number; unitText?: string };
  images: { main: string; thumbnails: string[]; gallery: string[] };
  dimensions?: { width?: number; height?: number; depth?: number; unit: "cm" | "mm" };
  weight?: { value: number; unit: string };
  color?: string[];
  materials?: string[];
  categories?: string[];
  rating?: { value: number; count: number };
  reviews?: { author?: string; rating?: number; title?: string; body?: string; date?: string }[];
  availability?: { text?: string; inStock?: boolean };
  designer?: string;
  gtin?: string;
  meta?: { availability?: string; rawTitle?: string; category?: string };
  raw?: unknown; // present only when `keepRaw` is enabled
};

Handling failures
import { BlockedError, NotFoundError } from "@ridvnv/ikea-scraper";

try {
  await ikea.getProduct(url);
} catch (err) {
  if (err instanceof NotFoundError) {
    // product is gone
  } else if (err instanceof BlockedError) {
    // back off / rotate proxy
  } else {
    throw err;
  }
}

Configuration
| Option | Default | Description |
| --- | --- | --- |
| locale | "nl/nl" | IKEA locale, country/lang. |
| baseUrl | "https://www.ikea.com" | Base URL. |
| timeoutMs | 15000 | Per-request timeout. |
| retries | 2 | Retries for transient failures. |
| concurrency | 2 | Max simultaneous page loads. |
| cacheTtlMs | 3600000 | Cache lifetime (1h). |
| useBrowserFallback | true | Escalate to puppeteer when HTTP can't render the page. |
| politenessDelayMs | 500 | Delay before each request. Set lower at your own risk — 0 can trigger rate-limiting/blocks. |
| keepRaw | false | Attach raw JSON-LD + meta to each product under raw. |
| userAgent | Linux Chrome UA | Request User-Agent. |
| cache | MemoryCache | Custom Cache backend (e.g. Redis). |
| logger | silent | Custom Logger. |
| htmlFetcher | — | Override raw HTML fetching (proxy/tests). |
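To show how the defaults in the table combine with caller overrides, the sketch below merges a partial options object over a defaults record and rejects obviously invalid values. It is illustrative only: `ScraperOptions`, `DEFAULTS`, and `resolveOptions` are hypothetical names that mirror the scalar options in the table, and the validation rules are assumptions rather than documented SDK behaviour.

```typescript
// Hypothetical mirror of the documented scalar options; the SDK's real option type may differ.
interface ScraperOptions {
  locale: string;
  baseUrl: string;
  timeoutMs: number;
  retries: number;
  concurrency: number;
  cacheTtlMs: number;
  useBrowserFallback: boolean;
  politenessDelayMs: number;
  keepRaw: boolean;
}

// Default values copied from the table above.
const DEFAULTS: ScraperOptions = {
  locale: "nl/nl",
  baseUrl: "https://www.ikea.com",
  timeoutMs: 15000,
  retries: 2,
  concurrency: 2,
  cacheTtlMs: 3600000,
  useBrowserFallback: true,
  politenessDelayMs: 500,
  keepRaw: false,
};

function resolveOptions(overrides: Partial<ScraperOptions> = {}): ScraperOptions {
  // Later spreads win, so any override replaces the matching default.
  const merged: ScraperOptions = { ...DEFAULTS, ...overrides };

  // Assumed sanity checks; the SDK may validate differently.
  if (!Number.isInteger(merged.concurrency) || merged.concurrency < 1) {
    throw new RangeError("concurrency must be a positive integer");
  }
  for (const key of ["timeoutMs", "retries", "cacheTtlMs", "politenessDelayMs"] as const) {
    if (merged[key] < 0) throw new RangeError(`${key} must not be negative`);
  }
  return merged;
}
```

For example, `resolveOptions({ retries: 5 })` keeps every default from the table except retries.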
Custom cache
import { IkeaScraper, type Cache } from "@ridvnv/ikea-scraper";

class RedisCache<V> implements Cache<V> {
  /* get / set / delete / clear */
}

const ikea = new IkeaScraper({ cache: new RedisCache() });

Logging
By default the SDK is silent. Pass a logger to surface retries, browser fallbacks, and warnings:
import { IkeaScraper, makeConsoleLogger, consoleLogger } from "@ridvnv/ikea-scraper";
// Level-filtered: only warn + error reach the console (good for production).
const ikea = new IkeaScraper({ logger: makeConsoleLogger("warn") });
// Or the unfiltered console logger (everything, including debug):
const verbose = new IkeaScraper({ logger: consoleLogger });

Implement the Logger interface (debug/info/warn/error) to forward into your own structured logger (pino, winston, etc.).
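A forwarding logger of that kind might look like the sketch below. The method signature `(message, meta?)` is an assumption, since only the four method names are documented here. `LoggerLike`, `LogRecord`, and `makeStructuredLogger` are hypothetical names, and the sink is a plain callback so the example stays independent of any particular logging library.

```typescript
// Assumed shape: only the four method names are documented, not their parameters.
interface LoggerLike {
  debug(message: string, meta?: unknown): void;
  info(message: string, meta?: unknown): void;
  warn(message: string, meta?: unknown): void;
  error(message: string, meta?: unknown): void;
}

type Level = "debug" | "info" | "warn" | "error";
const LEVEL_ORDER: Level[] = ["debug", "info", "warn", "error"];

interface LogRecord {
  level: Level;
  message: string;
  meta: unknown;
}

// Builds a logger that drops records below `minLevel` and hands the rest to `sink`.
function makeStructuredLogger(minLevel: Level, sink: (record: LogRecord) => void): LoggerLike {
  const threshold = LEVEL_ORDER.indexOf(minLevel);
  const emit = (level: Level) => (message: string, meta?: unknown): void => {
    if (LEVEL_ORDER.indexOf(level) >= threshold) sink({ level, message, meta });
  };
  return {
    debug: emit("debug"),
    info: emit("info"),
    warn: emit("warn"),
    error: emit("error"),
  };
}

// Usage: emit one JSON line per record; swap the sink for your own logger.
const jsonLogger = makeStructuredLogger("warn", (record) => {
  console.log(JSON.stringify(record));
});
jsonLogger.info("ignored at this level");
jsonLogger.warn("retrying request", { attempt: 1 });
```

If the SDK's actual Logger type takes different parameters, only the `emit` signature needs to change.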
Development
npm test # fast, offline, fixture-based unit tests
npm run test:integration # live tests against ikea.com (needs network + puppeteer)
npm run typecheck
npm run build # bundles ESM + CJS + .d.ts into dist/ via tsup

Reliability
A scraper's reliability is bounded by the site it scrapes — IKEA can change markup, rate-limit, or block at any time. This SDK maximises what's controllable: runtime validation, typed errors, retries/backoff, and caching. It is pure TypeScript + Zod because the failure modes are external, and Zod is what closes TypeScript's runtime-validation gap.
Use responsibly and in line with IKEA's terms of service.
License
MIT
