@0xkobold/pi-web
v0.2.0
Published
Web search and content extraction for pi agents — DuckDuckGo/SearX search, cascade fetching (fast → readability → Playwright), deep research
Maintainers
Readme
pi-web
Web search and content extraction for pi-coding-agent — cascade fetching, multi-engine search, deep research.
Installation
pi install npm:@0xkobold/pi-webOr as part of the meta-extension:
pi install npm:@0xkobold/pi-koboldFeatures
- Cascade Fetching — Fast HTML → Readability → Playwright for JS sites
- Multi-Engine Search — DuckDuckGo (default) + SearXNG (fallback)
- Deep Research — Search + fetch + synthesize from multiple sources
- Playwright Pool — Browser reuse with concurrency limits and retries
- Optional Dependency — Playwright is optional; fast fetch works without it
Tools
| Tool | Description |
|------|-------------|
| web_fetch | Fetch and extract content from a URL |
| web_search | Search web, optionally fetch content from results |
| web_research | Deep research: search + fetch + multi-source synthesis |
Commands
| Command | Description |
|---------|-------------|
| /deep-fetch <url> | Fetch JS-rendered content using Playwright |
| /web-search-deep <query> | Search + fetch from top results |
Parameters
web_fetch
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| url | string | required | Full URL to fetch |
| max_length | number | 5000 | Maximum characters to retrieve |
| use_playwright | boolean | false | Force Playwright for JS content |
| timeout_ms | number | 15000 | Timeout in ms (max: 60000) |
web_search
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| query | string | required | Search query |
| limit | number | 5 | Number of results (1-10) |
| fetch_content | boolean | false | Fetch full content from top results |
| fetch_sources | number | 3 | How many sources to fetch |
web_research
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| question | string | required | Research question |
| sources | number | 5 | Number of sources to analyze (1-10) |
Architecture
┌────────────────────────────────────────┐
│ pi-web │
├─────────────┬──────────────────────────┤
│ Search │ Content Extraction │
│ ───────── │ ────────────────────── │
│ DuckDuckGo │ 1. Fast fetch (HTML) │
│ SearXNG │ 2. Readability (regex) │
│ │ 3. Playwright (JS) │
├─────────────┴──────────────────────────┤
│ Playwright Browser Pool │
│ (concurrency: 2, pool TTL: 2m) │
└────────────────────────────────────────┘Development
cd packages/pi-web
bun install
bun run buildOptional: Playwright
For JavaScript-rendered content, install Playwright:
npm install playwright
npx playwright install chromiumWithout Playwright, web_fetch still works using fast HTML fetch and readability extraction.
