pi-free-web-search
v0.3.0
Free, hybrid, browser-aware web search and content extraction package for the Pi coding agent
Free, browser-aware web search and readable content extraction for the Pi coding agent, without paid APIs.
Why this package exists
pi-web-access is excellent, but its search path depends on Perplexity/Gemini. pi-free-web-search is for teams that want:
- zero paid APIs
- browser-aware behavior for automation, while defaulting searches to Yahoo and failing over across engines when needed
- HTTP-first performance with browser fallback only when quality requires it
- a package that feels native in Pi (tools, commands, status line, TUI rendering)
What it provides
| Capability | Name | Description |
|---|---|---|
| Tool | free_web_search | Natural-language web search with HTTP-first and browser fallback pipeline |
| Tool | free_fetch_content | Readable content extraction from a URL with browser fallback for JS-heavy pages |
| Command | /free-search-info | Shows detected browser, engine, mode, and executable |
| Command | /free-search-test <query> | End-to-end smoke test from inside Pi |
| Command | /free-search-debug <query> | Runs a real search and shows detailed debug logs/attempt metadata |
| Command | /free-search-status | Shows recent per-engine health, latency, failures, and cooldown state for the current session |
| Prompt | /pi-search <topic> | Short research template that steers the current session/model to use free_web_search and free_fetch_content |
| Skill | free-web-researcher | Guidance for robust research flow with these tools |
Quick start
1) Install dependencies
bun install
2) Run checks
bun run check
bun run smoke
3) Install into Pi
pi install /absolute/path/to/pi-free-web-search
4) Use the prompt shortcut
/pi-search exact Bun documentation for test reporters
/pi-search study the Playwright locator docs and explain best practices
How the search pipeline works
- Detect browser context for automation.
- Choose the configured search engine, or Yahoo by default.
- Build search URL for the active engine.
- Run HTTP search first.
- Re-rank and quality-check results.
- Escalate to browser automation only if needed and allowed.
- Merge/dedupe/rerank final results.
- Optionally fetch top-result content with readable extraction.
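The steps above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the `SearchResult` shape, the `qualityOf` heuristic (mean score of the top results), and the function names are all hypothetical, and the two searchers are injected so the control flow can be read without any network or browser code.

```typescript
// Hypothetical types and helpers; none of these names come from the package itself.
type SearchResult = {
  url: string;
  title: string;
  score: number; // assumed relevance in [0, 1], higher is better
};

type Searcher = (query: string) => Promise<SearchResult[]>;

// Mean score of the best few results, used here as a crude quality signal.
function qualityOf(results: SearchResult[], topN: number = 3): number {
  if (results.length === 0) return 0;
  const top = results.slice().sort((a, b) => b.score - a.score).slice(0, topN);
  return top.reduce((sum, r) => sum + r.score, 0) / top.length;
}

// Escalate only when browser use is allowed and HTTP quality is below the threshold.
function shouldEscalate(quality: number, threshold: number, browserAllowed: boolean): boolean {
  return browserAllowed && quality < threshold;
}

// Merge result lists, keep the best-scoring entry per URL, and sort by score.
function mergeDedupeRerank(...lists: SearchResult[][]): SearchResult[] {
  const best = new Map<string, SearchResult>();
  for (const list of lists) {
    for (const r of list) {
      const key = r.url.replace(/\/+$/, ""); // treat a trailing slash as the same page
      const seen = best.get(key);
      if (!seen || r.score > seen.score) best.set(key, { ...r, url: key });
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}

// HTTP first; the browser searcher runs only when the quality check asks for it.
async function hybridSearch(
  query: string,
  httpSearch: Searcher,
  browserSearch: Searcher,
  opts: { threshold: number; browserAllowed: boolean },
): Promise<SearchResult[]> {
  const httpResults = await httpSearch(query);
  if (!shouldEscalate(qualityOf(httpResults), opts.threshold, opts.browserAllowed)) {
    return mergeDedupeRerank(httpResults);
  }
  const browserResults = await browserSearch(query);
  return mergeDedupeRerank(httpResults, browserResults);
}
```

In this sketch, `opts.threshold` plays the role of `browserFallbackThreshold`, and `opts.browserAllowed` would be false when `mode` is `disabled`. How the package actually scores results is not documented here, so treat the heuristic as a placeholder.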
Supported targets
Operating systems
- macOS
- Linux
Browsers / families
- Safari
- Chrome
- Brave
- Edge
- Chromium
- Firefox
- Dia Browser (best-effort via Chromium-family fallback)
Search engines
- Bing
- DuckDuckGo
- Brave Search
- Yahoo
- SearXNG (if configured)
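Failing over across these engines implies some per-session bookkeeping of which engines are currently healthy. The sketch below shows one way that could work, assuming the semantics suggested by the `engineFailureThreshold` and `engineHealthCooldownMs` settings. The `EngineHealth` class, `pickEngine`, and the lowercase engine ids other than `yahoo` are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```typescript
// Hypothetical tracker; option names mirror the config fields, the class is illustrative.
type HealthOptions = {
  engineFailureThreshold: number; // consecutive failures before an engine is skipped
  engineHealthCooldownMs: number; // how long a skipped engine stays cooled down
};

class EngineHealth {
  private readonly opts: HealthOptions;
  private readonly failures = new Map<string, number>();
  private readonly cooledUntil = new Map<string, number>();

  constructor(opts: HealthOptions) {
    this.opts = opts;
  }

  // A success clears both the failure streak and any cooldown.
  recordSuccess(engine: string): void {
    this.failures.delete(engine);
    this.cooledUntil.delete(engine);
  }

  // Count a consecutive failure; start (or extend) a cooldown once the threshold is hit.
  recordFailure(engine: string, now: number): void {
    const count = (this.failures.get(engine) ?? 0) + 1;
    this.failures.set(engine, count);
    if (count >= this.opts.engineFailureThreshold) {
      this.cooledUntil.set(engine, now + this.opts.engineHealthCooldownMs);
    }
  }

  // An engine is available if it was never cooled down or its cooldown has expired.
  isAvailable(engine: string, now: number): boolean {
    const until = this.cooledUntil.get(engine);
    return until === undefined || now >= until;
  }
}

// Return the first engine in preference order that is not cooling down.
function pickEngine(order: string[], health: EngineHealth, now: number): string | undefined {
  return order.find((engine) => health.isAvailable(engine, now));
}

// Usage: two failures put "yahoo" on cooldown, so the next engine is chosen.
const health = new EngineHealth({ engineFailureThreshold: 2, engineHealthCooldownMs: 600000 });
health.recordFailure("yahoo", 0);
health.recordFailure("yahoo", 1000);
const chosen = pickEngine(["yahoo", "bing"], health, 2000);
```

One design choice worth noting: in this sketch the failure streak is kept after a cooldown expires, so a single further failure cools the engine down again until a success resets it. Whether the package behaves this way is an assumption.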
Configuration
Create ~/.pi/free-web-search.json:
{
  "mode": "auto",
  "httpFirst": true,
  "browserFallbackThreshold": 0.55,
  "preferredEngine": "yahoo",
  "locale": "en-US",
  "language": "en"
}
Project-local override is also supported:
.pi/free-web-search.json
Configuration reference
| Field | Type | Default | Notes |
|---|---|---|---|
| mode | auto \| visible \| headless \| ask \| disabled | auto | Global browser execution policy (ask prompts before browser automation in Pi UI) |
| preferredBrowser | browser family | detected | Force browser family |
| preferredEngine | search engine id | yahoo | Force search engine |
| locale | string | system locale | Locale/market hint for engines that support it (for example Bing mkt) |
| language | string | system language | Language hint for engines that support it (for example Yahoo/Google hl) |
| searchTemplateUrl | string | per engine | Custom search URL template |
| browserExecutablePath | string | auto-resolved | Explicit browser executable |
| chromiumProfilePath | string | auto | Chromium-family profile path |
| firefoxProfilePath | string | auto | Firefox profile path |
| searxngBaseUrl | string | unset | Base URL for SearXNG |
| httpFirst | boolean | true | Skip HTTP path when false |
| browserFallbackThreshold | number | 0.55 | Quality threshold for fallback |
| httpTimeoutMs | number | 10000 | Timeout for HTTP search/fetch |
| browserNavigationTimeoutMs | number | 12000 | Browser navigation timeout |
| browserResultWaitMs | number | 700 | Additional wait for dynamic result content |
| contentMinMarkdownLength | number | 200 | Minimum extraction size before browser fallback |
| includeContentMinScore | number | 2 | Skip low-relevance search results when includeContent=true |
| maxContentFetchConcurrency | number | 2 | Max parallel content fetches when includeContent=true |
| engineHealthCooldownMs | number | 600000 | How long session engine failures remain cooled down before retry |
| engineFailureThreshold | number | 2 | Consecutive failures before a session temporarily skips an engine |
| userAgent | string | bundled UA | Override request UA |
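To make the layering concrete, here is a small sketch of how defaults, the global file, and a project-local override might combine. It assumes that the project-local file takes precedence over `~/.pi/free-web-search.json`, which in turn takes precedence over the defaults in the table; the README does not state the exact precedence. The `ConfigSubset` type and `resolveConfig` function are hypothetical, only a few fields are shown, and reading the JSON files from disk is omitted.

```typescript
// A few fields from the table above; type and function names are hypothetical.
type ConfigSubset = {
  mode: "auto" | "visible" | "headless" | "ask" | "disabled";
  httpFirst: boolean;
  browserFallbackThreshold: number;
  preferredEngine: string;
  httpTimeoutMs: number;
};

// Defaults copied from the configuration reference.
const DEFAULTS: ConfigSubset = {
  mode: "auto",
  httpFirst: true,
  browserFallbackThreshold: 0.55,
  preferredEngine: "yahoo",
  httpTimeoutMs: 10000,
};

// Later layers win: defaults, then the global file, then the project-local file.
// Inputs are assumed to be parsed JSON, so keys are either present or absent.
function resolveConfig(
  globalFile: Partial<ConfigSubset> = {},
  projectFile: Partial<ConfigSubset> = {},
): ConfigSubset {
  return { ...DEFAULTS, ...globalFile, ...projectFile };
}

// Usage: the project-local engine choice overrides the global one ("bing" and
// "duckduckgo" are assumed engine ids); untouched fields keep their defaults.
const resolved = resolveConfig(
  { preferredEngine: "bing", httpTimeoutMs: 5000 },
  { preferredEngine: "duckduckgo" },
);
```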
Usage examples in Pi
free_web_search({ query: "Bun runtime documentation", numResults: 5 })
free_web_search({ query: "React server components caching", includeContent: true })
free_web_search({ query: "Supabase RLS docs", domainFilter: ["supabase.com"] })
free_web_search({ query: "OpenAI Responses API reference", engine: "yahoo", mode: "headless", debug: true })
free_fetch_content({ url: "https://bun.sh/docs" })
For manual diagnostics inside Pi:
/free-search-debug OpenAI Responses API documentation
Development
bun install
bun run typecheck
bun test
bun run check
bun run smoke
# CI-safe smoke mode (no browser automation)
FREE_WEB_SMOKE_MODE=disabled FREE_WEB_SMOKE_ALLOW_OFFLINE=1 bun run smoke
Open source project health
This repository includes the standard community health files and templates:
- CONTRIBUTING.md
- CODE_OF_CONDUCT.md
- SECURITY.md
- SUPPORT.md
- Issue templates
- PR template
- Release configuration
- Changelog
Notes
- v0.x focuses on normal web pages, not YouTube/PDF/GitHub-specialized extraction flows.
- Browser and engine detection are best-effort and can be overridden in config.
- Safari automation uses Playwright WebKit instead of directly controlling Safari binaries.
- The package is authored and tested with Bun.
License
MIT — see LICENSE.
