@duckstech/flatbrowser
v1.0.1
Published
Semantic headless browser — any website becomes structured JSON ordered by z-index
Downloads
178
Maintainers
Readme
Flatbrowser
A Chrome built for automation and AI. Any website becomes structured JSON — no selectors, no scrapers, no vision models.
Flatbrowser renders pages with Chrome via CDP but returns semantic data instead of pixels. Elements are ordered by z-index (modals and toasts appear before page content), interactive vs. content is separated, and everything behind overlays is marked blocked. Works as a CLI and a programmatic library.
import { flatbrowser } from "@duckstech/flatbrowser";
const fb = flatbrowser();
const s = fb.session("login");
await s.navigate("https://app.example.com/login");
await s.action("label=Email", "fill", "[email protected]");
await s.action("label=Password", "fill", "secret");
await s.action("text=Sign in", "click");
await s.waitForText("Dashboard");
console.log(s.recording()); // every step auto-recorded
console.log(s.exportPuppeteer()); // export as a replayable script
await fb.close();Why Flatbrowser
| Most tools... | Flatbrowser... |
|---|---|
| Return pixels (need vision models) | Returns semantic JSON (LLMs read directly) |
| Require CSS selectors | Accepts text=Login, role=button, label=Email |
| Fight async onclick handlers | Uses CDP input events — no page.evaluate() hangs |
| Force you to pass Page everywhere | Session encapsulates everything |
| Manual recording | Auto-records every operation |
| Ignore z-index | First-class overlay and stacking context handling |
Installation
npm install @duckstech/flatbrowser
# or: bun add @duckstech/flatbrowser / pnpm add @duckstech/flatbrowser / yarn add @duckstech/flatbrowserRequires Node.js 20+ and Chrome or Chromium installed. Flatbrowser auto-detects Chrome on Linux, macOS, and Windows. Override with CHROME_PATH if needed.
# If Chrome is not in a standard location
export CHROME_PATH=/path/to/chrome
# Or install Chrome programmatically
npx @puppeteer/browsers install chrome@stableCLI
# Extract any page as structured JSON
npx flatbrowser navigate https://example.com
# Human-readable for terminal or LLM input
npx flatbrowser navigate https://example.com --format text
# Only interactive elements (compact)
npx flatbrowser navigate https://example.com --compact
# Interactive shell with persistent session
npx flatbrowser shell12 commands: navigate, action, read, screenshot, find, evaluate, console, network, storage, wait, session, shell. See CLI.md for the full reference.
Library
Quick start
import { flatbrowser } from "@duckstech/flatbrowser";
const fb = flatbrowser();
// One-shot: uses the "default" session
const page = await fb.navigate("https://example.com");
console.log(page.title, page.layers);
await fb.close();Session-based workflow
const fb = flatbrowser();
const s = fb.session("checkout");
await s.navigate("https://shop.example.com/cart");
await s.action("text=Add to cart", "click");
await s.action("text=Checkout", "click");
await s.waitForSelector("form[name=payment]");
// Everything auto-recorded — no store.record() calls
const script = s.exportPuppeteer();
fs.writeFileSync("checkout-flow.ts", script);
await fb.close();Multi-session (parallel)
Each session has its own page, console log, network log, and recording. Safe for Promise.all.
const fb = flatbrowser();
const buyer = fb.session("buyer");
const seller = fb.session("seller");
await Promise.all([
buyer.navigate("https://marketplace.com/buy"),
seller.navigate("https://marketplace.com/sell"),
]);
await fb.close();Feeding pages to an LLM
import { flatbrowser, formatPageForAI } from "@duckstech/flatbrowser";
const fb = flatbrowser();
const s = fb.session();
const page = await s.navigate("https://news.ycombinator.com");
const prompt = formatPageForAI(page, { compact: true });
// Send `prompt` to Claude, GPT, Gemini — they see the page as text
// e.g. "[a] selector=... 'Show HN: ...' → /item?id=..."
await fb.close();Recording, export, and replay
const fb = flatbrowser();
const s = fb.session("record-me");
await s.navigate("https://site.com");
await s.action("text=Next", "click");
// Persist the recording
fs.writeFileSync("flow.json", s.exportJson());
fs.writeFileSync("flow.ts", s.exportPuppeteer()); // standalone Puppeteer script
// Later — replay in a fresh session
const replayer = fb.session("replay");
replayer.loadRecording("flow.json");
await replayer.replay();
await fb.close();Semantic targeting
No CSS selectors required:
| Prefix | Finds by | Example |
|---|---|---|
| text= | Visible text | text=Sign in |
| role= | ARIA role + optional name | role=button[Submit] |
| label= | Associated <label> | label=Email |
| placeholder= | Placeholder attr | placeholder=Search |
| title= | title attr | title=Close |
| (no prefix) | Treated as CSS selector | #main > .item |
FlatSession API
All operations are auto-recorded and return the session's new page state where applicable.
s.navigate(url, options?) // → FlatPage
s.read() // → FlatPage
s.action(target, action, value?) // → FlatPage
s.actions(steps[]) // → FlatPage (batch)
s.find(target) // → { found, selector?, tag?, text? }
s.waitForSelector(sel, { timeout })
s.waitForHidden(sel, { timeout })
s.waitForText(text, { timeout })
s.waitForNetwork({ timeout })
s.evaluate(expression) // → unknown (JS in page)
s.screenshot({ selector?, fullPage? }) // → Buffer
s.storage({ type, action, key?, value? }) // cookies / localStorage / sessionStorage
s.emulate({ locale?, timezone?, geolocation?, touchEnabled? })
s.console(options?) // → ConsoleEntry[]
s.network(options?) // → NetworkEntry[]
s.responseBody(requestId) // → { body, base64Encoded } | null
s.recording() // → RecordedStep[]
s.exportJson() // → string
s.exportPuppeteer() // → string
s.loadRecording(filePath) // → number (steps loaded)
s.replay() // → FlatPage
s.close() // close this session's page
s.page() // → Puppeteer Page (escape hatch)
s.cdpSession() // → CDPSession (escape hatch)Low-level API
For cases the session API doesn't cover, the standalone functions are still exported:
import {
newPage,
extractPage,
performAction,
findElement,
resolveTarget,
getCDPSession,
captureScreenshot,
setEmulation,
} from "@duckstech/flatbrowser";The FlatPage format
{
url: string,
title: string,
layers: [
{
zIndex: number,
isOverlay: boolean, // true if this layer covers >50% of viewport
content: FlatElement[], // h1, p, img, etc.
interactive: FlatElement[], // a, button, input, etc.
},
// ... sorted by zIndex, highest first
],
forms: [
{ id?, action?, method?, fields: FlatElement[] },
],
timestamp: string,
}Each FlatElement has tag, role?, text?, href?, rect, zIndex, blocked, selector (unique CSS), plus tag-specific fields (value, checked, placeholder, etc.).
How it works
- Chrome renders the page (via CDP, not Puppeteer abstractions).
- Flatbrowser injects a vanilla JS extraction script in an isolated world — traverses the DOM, computes effective z-index per element (respects
position,opacity,transform,will-change), generates unique selectors, and groups into layers. - Elements beneath overlays (>50% viewport) are marked
blocked. - Actions (click, fill, hover) dispatch trusted input events via CDP — avoids the
page.evaluate()deadlock whenonclickhandlers callfetch. - If the main context is corrupted, a
DOMSnapshotfallback runs with no JS execution.
License
MIT
