@builtbyecho/research

v0.5.2

BuiltByEcho Research: local-first web research, browser rendering, source audits, traces, and citation-backed reports.

BuiltByEcho Research

Local-first web research for agents: plan, search, fetch, render, rank, audit, trace, and report.

BuiltByEcho Research is a practical research harness for agents and developers who need reliable source discovery, citation-backed reports, and browser-aware extraction — without depending on paid scraping APIs.

It starts cheap with plain HTTP fetches, escalates to Playwright only when a page needs rendering, scores source quality, audits evidence coverage, and saves traceable JSON artifacts so every research run can be inspected later.

┌─────────┐   ┌────────┐   ┌───────────┐   ┌──────────────────────┐
│  Plan   │──▶│ Search │──▶│   Fetch   │──▶│  Render?             │
│         │   │        │   │ (cheap)   │   │  ┌─ JS-gated?        │
│ multi-  │   │ Brave  │   │           │   │  ├─ thin text?       │
│ angle   │   │ or DDG │   │ Readabil. │   │  └─ auto-escalate    │
└─────────┘   └────────┘   └───────────┘   └──────────┬───────────┘
                                                      │
                    ┌─────────────────────────────────┘
                    ▼
             ┌────────────┐   ┌────────────┐   ┌──────────┐   ┌──────────┐
             │  Score &   │──▶│ Diversify  │──▶│  Audit   │──▶│  Report  │
             │   Rank     │   │  by Host   │   │          │   │  + Trace │
             └────────────┘   └────────────┘   └──────────┘   └──────────┘

Why this exists

Most "research agents" fail in boring ways: one search query, shallow snippets, no source quality checks, no audit trail, no repeatability, and no clear path when evidence is weak.

BuiltByEcho Research is designed around the opposite workflow:

  • Plan first — generate multiple search angles before fetching.
  • Fetch cheap first — use static HTTP + Readability before opening a browser.
  • Render when needed — use Playwright for JS-gated or thin pages.
  • Read pages semantically — capture ARIA/accessibility snapshots, not just raw HTML.
  • Rank evidence — source quality, freshness, host diversity, and extraction success all matter.
  • Audit the run — mark weak research, missing coverage, and suggested follow-ups.
  • Leave a trace — write reproducible JSON traces for debugging and review.
  • Write usable reports — generate a readable brief with findings, caveats, and sources.
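
As a rough illustration of the "plan first" step, the sketch below expands one objective into several deterministic query angles. The function name `planAngles` and the suffix list are assumptions made for this example (two suffixes are borrowed from the follow-up queries in the audit example later in this README); it is not the package's planner.

```javascript
// Hypothetical sketch: deterministic multi-angle query expansion.
// The angle suffixes are illustrative assumptions, not the package's list.
const ANGLES = [
  '',
  'architecture implementation patterns',
  'limitations failure modes criticism',
  'comparison alternatives',
];

function planAngles(objective, maxQueries = 4) {
  // Normalize whitespace so the same objective always yields the same plan.
  const base = objective.trim().replace(/\s+/g, ' ');
  const queries = ANGLES.map((suffix) => (suffix ? `${base} ${suffix}` : base));
  // De-duplicate while keeping order, then cap the plan size.
  return [...new Set(queries)].slice(0, maxQueries);
}

console.log(planAngles('research agent architecture', 3));
```

Because the output depends only on the input string, the same objective produces the same plan on every run, which is what makes a plan reproducible in a trace.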

Current status

Version: 0.5.2

Good for:

  • Landscape and competitive research
  • Technical and source discovery
  • Citation-backed first drafts
  • Browser-rendered extraction from JS-heavy pages
  • Structured page extraction (emails, phones, pricing, links)
  • Agent pipelines that need JSON outputs with audit trails

Not a replacement for human judgment. Treat generated prose as a strong first draft and review it before high-stakes use.

Installation

Option 1: Install from GitHub

npm install -g github:BuiltByEcho/research
builtbyecho-research --help

Alias:

echo-research --help

Option 2: Clone locally

git clone https://github.com/BuiltByEcho/research.git
cd research
npm install
npx playwright install chromium
npm test
node src/cli.mjs --help

Option 3: npm package

npm install -g @builtbyecho/research
builtbyecho-research --help

OpenClaw / Agent skill

This repo includes an OpenClaw-compatible skill at skills/builtbyecho-research/SKILL.md.

Agents can use that file as their install and usage guide for @builtbyecho/research, including npx, global npm install, browser setup, and common research commands.

Optional setup

No API key is required. Everything runs locally.

Optional Scrapling backend for adaptive/stealthier Python fetching:

python3 -m pip install 'scrapling[fetchers]'
builtbyecho-research fetch https://example.com --backend scrapling
builtbyecho-research fetch https://example.com --backend scrapling-dynamic
builtbyecho-research fetch https://example.com --backend scrapling-stealth

If Scrapling lives in a virtualenv, pass --scrapling-python /path/to/venv/bin/python or set SCRAPLING_PYTHON.

For better search discovery, add a Brave API key:

cp .env.example .env
# edit .env and set BRAVE_API_KEY=...

Without BRAVE_API_KEY, search falls back to DuckDuckGo HTML scraping. Fetch, render, crawl, extraction, reports, audits, chunking, and traces all work without it.
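
The fallback described above amounts to a small decision on one environment variable. Here is a minimal sketch of that decision; `pickSearchBackend` and the backend labels are hypothetical names for illustration, and the real selection logic in the package may differ.

```javascript
// Hypothetical helper: choose a search backend from environment variables.
// Only BRAVE_API_KEY comes from the README; the labels are assumptions.
function pickSearchBackend(env) {
  const key = (env.BRAVE_API_KEY || '').trim();
  return key
    ? { backend: 'brave', apiKey: key }
    : { backend: 'duckduckgo-html', apiKey: null };
}

console.log(pickSearchBackend({}).backend);
console.log(pickSearchBackend({ BRAVE_API_KEY: 'example-key' }).backend);
```

In a real script you would pass `process.env` rather than a literal object; taking the environment as an argument keeps the function easy to test.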

Quick start

Search and fetch

builtbyecho-research search "browser automation accessibility snapshots" -n 5
builtbyecho-research fetch https://example.com --max-chars 5000
builtbyecho-research fetch https://example.com --backend scrapling --max-chars 5000

Render a JavaScript-heavy page

builtbyecho-research render https://example.com

The render command returns page text, links, HTML, and a compact ARIA/accessibility snapshot.

Use a persistent browser profile

Useful for logged-in sites or pages that need cookies:

builtbyecho-research render https://app.example.com --profile default --headed

Profiles are stored under .profiles/ by default and are gitignored.

Run a multi-angle research pipeline

builtbyecho-research pipeline "research agent architecture" \
  --expand \
  --rounds 2 \
  -n 8 \
  --format markdown \
  --trace

Generate a citation-backed report

builtbyecho-research report "Playwright MCP browser automation best practices" \
  -n 6 \
  --rounds 2 \
  --trace

Run a brief (multi-angle research pass)

builtbyecho-research brief "AI agent orchestration frameworks" \
  -n 8 \
  --rounds 2 \
  --format markdown \
  --trace

Compare multiple topics side-by-side

builtbyecho-research compare "LangChain" "LlamaIndex" "CrewAI" \
  --per-query 3 \
  --format report

Extract structured fields from a page

builtbyecho-research extract https://example.com --schema links,headings
builtbyecho-research extract https://example.com --schema emails,phones,pricing,contact_links,socials

Crawl a site

builtbyecho-research crawl https://docs.example.com \
  --depth 2 \
  --max-pages 25 \
  --chunk

Commands

| Command | Purpose |
|---|---|
| search <query> | Search discovery using Brave API or DuckDuckGo fallback |
| fetch <url> | Cheap HTTP fetch + Readability/metadata extraction |
| render <url> | Playwright render + HTML/text/links + ARIA snapshot |
| crawl <url> | Depth/page-limited BFS crawl with optional chunks |
| plan <objective> | Deterministic multi-angle query plan |
| pipeline <query> | Search → fetch/render → rank → audit → output |
| brief <objective> | Multi-angle, citation-aware research pass (expand + iterative) |
| report <objective> | Executive markdown report with findings/caveats/sources |
| extract <url> | Local heuristic structured extraction |
| compare <queries...> | Multi-query research comparison |
| cache | Cache stats/purge/clear |

Key flags

Pipeline, brief, report, compare:

| Flag | Default | Purpose |
|---|---|---|
| -n, --count | 5–8 | Number of results to return |
| --rounds | 1–2 | Iterative follow-up rounds (more rounds = deeper research) |
| --expand | off | Expand query into multiple search angles |
| --max-queries | 4–5 | Number of query angles when expanding |
| --chunk | off | Include citation-ready chunks in output |
| --trace | off | Write trace JSON under output/traces/ |
| -f, --format | json | Output format: json, markdown, summary, jsonfeed, report |
| --no-cache | off | Bypass cache for fresh results |

Pipeline only:

| Flag | Default | Purpose |
|---|---|---|
| --domains <list> | all | Comma-separated domain allowlist |
| --exclude-domains <list> | none | Comma-separated domain blocklist |
| --domain-cap | 2 | Max results per host after reranking |
| --no-diverse | off | Disable host diversity reranking |
| --no-auto-render | off | Disable Playwright escalation for JS-gated/thin pages |
| --profile <name> | none | Persistent Playwright browser profile |
| --profile-dir <dir> | .profiles | Base directory for profiles |
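
To make the `--domain-cap` idea concrete, here is a minimal sketch of a per-host cap applied to a list that is already sorted by score. `capPerHost`, the result shape, and the choice to treat `www.` as the same host are assumptions for illustration, not the package's reranker.

```javascript
// Hypothetical sketch of a per-host cap applied to an already-ranked list.
function capPerHost(rankedResults, cap = 2) {
  const seen = new Map(); // host -> number of results kept so far
  return rankedResults.filter(({ url }) => {
    // Assumption: "www.a.example" and "a.example" count as one host.
    const host = new URL(url).hostname.replace(/^www\./, '');
    const count = seen.get(host) || 0;
    if (count >= cap) return false;
    seen.set(host, count + 1);
    return true;
  });
}

const ranked = [
  { url: 'https://a.example/1' },
  { url: 'https://www.a.example/2' },
  { url: 'https://a.example/3' },
  { url: 'https://b.example/1' },
];
console.log(capPerHost(ranked, 2).map((r) => r.url));
```

Filtering in rank order keeps the best results from each host and drops only the surplus, so a single dominant domain cannot fill the whole result set.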

Crawl:

| Flag | Default | Purpose |
|---|---|---|
| --depth | 1 | Link depth to follow |
| --max-pages | 20 | Maximum pages to fetch |
| --same-domain | on | Stay on the same domain |
| --cross-domain | off | Allow off-domain links |
| --include <regexes> | none | Comma-separated URL include patterns |
| --exclude <regexes> | none | Comma-separated URL exclude patterns |
| --chunk | off | Include citation-ready chunks |
| --chunk-chars | 1200 | Chunk size in characters |
| --overlap-chars | 150 | Chunk overlap in characters |

Cache subcommands:

builtbyecho-research cache stats     # show cache stats
builtbyecho-research cache purge     # remove expired entries
builtbyecho-research cache clear     # clear everything
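
For readers unfamiliar with the difference between purging and clearing, the sketch below shows a small TTL cache with the same three operations. The class `TtlCache` and its fields are hypothetical; this illustrates the concept under simple assumptions and is not the package's cache module.

```javascript
// Hypothetical in-memory TTL cache; `now` is injectable so expiry is testable.
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
    this.hits = 0;
    this.misses = 0;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    // Missing and expired entries both count as misses.
    if (!entry || entry.expiresAt <= this.now()) {
      this.misses += 1;
      return undefined;
    }
    this.hits += 1;
    return entry.value;
  }

  // Comparable to `cache purge`: drop only expired entries.
  purge() {
    let removed = 0;
    for (const [key, entry] of this.entries) {
      if (entry.expiresAt <= this.now()) {
        this.entries.delete(key);
        removed += 1;
      }
    }
    return removed;
  }

  // Comparable to `cache clear`: drop everything.
  clear() {
    this.entries.clear();
  }

  // Comparable to `cache stats`.
  stats() {
    return { size: this.entries.size, hits: this.hits, misses: this.misses };
  }
}
```

Purging is the safe routine operation because fresh entries survive; clearing is the blunt option when cached results are suspected to be stale or wrong.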

Output formats

Use --format with pipeline, brief, or compare:

| Format | Description |
|---|---|
| json | Full structured output (default) |
| markdown / md | Source-by-source research report |
| report | Executive report with findings and sources |
| summary | Compact titles and URLs |
| jsonfeed | JSON Feed v1.1 |

Example:

builtbyecho-research pipeline "open source deep research agents" --expand -n 8 --format report
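
For orientation, this is roughly what a JSON Feed v1.1 document looks like when built from a list of sources. The function `toFeed` and the input shape (`url`, `title`, `text`) are assumptions for this sketch; the fields the package actually emits with `--format jsonfeed` may differ.

```javascript
// Hypothetical mapping from research results to a JSON Feed 1.1 document.
// The input shape ({ url, title, text }) is an assumption for this example.
function toFeed(title, results) {
  return {
    version: 'https://jsonfeed.org/version/1.1',
    title,
    items: results.map((r) => ({
      id: r.url, // JSON Feed requires a unique id per item; the URL serves here
      url: r.url,
      title: r.title,
      content_text: r.text,
    })),
  };
}

const feed = toFeed('open source deep research agents', [
  { url: 'https://example.com/a', title: 'Example A', text: 'Summary of A.' },
]);
console.log(JSON.stringify(feed, null, 2));
```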

Research audits

Every pipeline, brief, and report run includes an audit:

{
  "grade": "strong",
  "resultCount": 6,
  "uniqueHosts": 5,
  "highQualitySources": 6,
  "warnings": [],
  "followUpQueries": [
    "... architecture implementation patterns",
    "... limitations failure modes criticism"
  ]
}

Grades:

  • strong — enough source diversity and high-quality evidence
  • needs-review — useful but has gaps worth checking
  • weak — not enough evidence; run follow-ups or change query
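
The README does not state the exact rules behind each grade, so the sketch below only shows the general shape such a rule could take. Every threshold and warning string here is an assumption chosen for the example, and `gradeRun` is a hypothetical name.

```javascript
// Hypothetical grading rule; thresholds are assumptions, not the package's.
function gradeRun({ resultCount, uniqueHosts, highQualitySources }) {
  const warnings = [];
  if (resultCount < 3) warnings.push('few results');
  if (uniqueHosts < 3) warnings.push('low host diversity');
  if (highQualitySources < 2) warnings.push('few high-quality sources');

  // No warnings: strong. One gap: needs-review. Two or more: weak.
  let grade = 'strong';
  if (warnings.length === 1) grade = 'needs-review';
  if (warnings.length >= 2) grade = 'weak';
  return { grade, warnings };
}

console.log(gradeRun({ resultCount: 6, uniqueHosts: 5, highQualitySources: 6 }));
console.log(gradeRun({ resultCount: 4, uniqueHosts: 1, highQualitySources: 3 }));
```

An agent consuming the audit can branch on `grade`: accept a strong run, flag a needs-review run for a human, and rerun a weak one with the suggested follow-up queries.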

Traces

Add --trace to save the full run under output/traces/:

builtbyecho-research report "AI browser automation tools" --trace

A trace includes:

  • Research plan
  • Search queries issued
  • Fetched URLs
  • Source quality scores
  • Audit result
  • Citations
  • Final selected sources

Traces are useful for debugging agents, reproducing results, and explaining where a report came from.

Browser escalation

The pipeline auto-renders when cheap fetch looks weak:

  • "enable JavaScript" / "just a moment" / CAPTCHA-like signals
  • Very low visible text
  • Script-heavy page with thin text
  • Explicit render: true in library usage
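
The signals listed above can be approximated with a few string checks on the fetched HTML. The sketch below is one way to do that; `needsRender`, the marker phrases, and the numeric thresholds are assumptions for illustration and do not reflect the package's own `shouldRenderEscalate` logic.

```javascript
// Hypothetical escalation check; marker phrases and thresholds are assumptions.
const BLOCK_MARKERS = [/enable javascript/i, /just a moment/i, /captcha/i];

function needsRender(html, { minTextChars = 200, render = false } = {}) {
  if (render) return { escalate: true, reason: 'explicit' };

  const scripts = (html.match(/<script\b/gi) || []).length;
  // Crude visible-text estimate: drop scripts, styles, and tags.
  const text = html
    .replace(/<script\b[\s\S]*?<\/script>/gi, ' ')
    .replace(/<style\b[\s\S]*?<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();

  if (BLOCK_MARKERS.some((re) => re.test(text))) {
    return { escalate: true, reason: 'blocked' };
  }
  if (scripts >= 5 && text.length < minTextChars * 2) {
    return { escalate: true, reason: 'script-heavy' };
  }
  if (text.length < minTextChars) {
    return { escalate: true, reason: 'thin-text' };
  }
  return { escalate: false, reason: null };
}

console.log(needsRender('<div id="root"></div>'));
```

Regex-based tag stripping is only a rough estimate of visible text, which is acceptable here because the result decides whether to spend a browser render, not what content to keep.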

Disable it for fetch-only behavior:

builtbyecho-research pipeline "topic" --no-auto-render

Use a persistent profile for sites requiring login:

builtbyecho-research pipeline "topic" --profile default --headed

Structured extraction

extract uses local heuristics — no model calls, no API keys:

| Schema field | What it finds |
|---|---|
| emails | Email addresses |
| phones | Phone numbers |
| pricing | Price patterns ($99, €49/mo, etc.) |
| links | All links with text and URLs |
| contact_links | Links containing "contact", "about", "support", etc. |
| socials | Social media profile links |
| companies | Company/organization names (from headings, meta) |
| headings | Page heading hierarchy (h1–h6) |

Plus arbitrary keyword fields, returned as relevant sentences.

builtbyecho-research extract https://company.example --schema emails,phones,pricing,contact_links,socials

Add --render for pages that need Playwright:

builtbyecho-research extract https://spa.example --schema links --render
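
To show what "local heuristics" can mean in practice, here is a minimal regex-based extractor for two of the schema fields. The patterns and the function `extractFields` are simplified assumptions for this example; the package's extractors are likely more thorough.

```javascript
// Hypothetical regex extractors; patterns are deliberately simple assumptions.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PRICE_RE = /[$€£]\s?\d+(?:[.,]\d{2})?(?:\/(?:month|mo|year|yr))?/g;

function extractFields(text, schema) {
  // Accept a comma-separated schema string, mirroring the --schema flag.
  const fields = schema.split(',').map((f) => f.trim());
  const found = {};
  if (fields.includes('emails')) {
    found.emails = [...new Set(text.match(EMAIL_RE) || [])];
  }
  if (fields.includes('pricing')) {
    found.pricing = [...new Set(text.match(PRICE_RE) || [])];
  }
  return found;
}

const sample =
  'Plans start at $99 or €49/mo. Contact sales@example.com or sales@example.com.';
console.log(extractFields(sample, 'emails,pricing'));
```

Heuristics like these are fast and deterministic, at the cost of missing unusual formats, so extracted values are best treated as candidates to verify.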

Crawling

crawl does breadth-first traversal with depth and page limits:

# Basic crawl
builtbyecho-research crawl https://docs.example.com --depth 2 --max-pages 25

# With citation-ready chunks
builtbyecho-research crawl https://docs.example.com --depth 2 --max-pages 25 --chunk

# Filter URLs with patterns
builtbyecho-research crawl https://blog.example.com --include "/post/,/article/" --exclude "/tag/,/page/"

# Allow cross-domain links
builtbyecho-research crawl https://example.com --cross-domain --depth 2
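
The interaction between `--depth` and `--max-pages` is easier to see on a toy graph. The sketch below runs a breadth-first traversal over an in-memory link map instead of real HTTP requests; `bfsCrawl`, `getLinks`, and the sample graph are hypothetical and stand in for the actual crawler.

```javascript
// Hypothetical BFS over an in-memory link graph standing in for real fetches.
function bfsCrawl(startUrl, getLinks, { depth = 1, maxPages = 20 } = {}) {
  const visited = new Set([startUrl]);
  const queue = [{ url: startUrl, level: 0 }];
  const order = [];

  while (queue.length > 0 && order.length < maxPages) {
    const { url, level } = queue.shift();
    order.push(url);
    if (level >= depth) continue; // do not follow links past the depth limit
    for (const next of getLinks(url)) {
      if (!visited.has(next)) {
        visited.add(next); // mark on enqueue so a page is never queued twice
        queue.push({ url: next, level: level + 1 });
      }
    }
  }
  return order;
}

const graph = {
  '/': ['/a', '/b'],
  '/a': ['/a/1', '/'],
  '/b': ['/b/1'],
  '/a/1': ['/deep'],
};
const links = (url) => graph[url] || [];

console.log(bfsCrawl('/', links, { depth: 1 }));
console.log(bfsCrawl('/', links, { depth: 2, maxPages: 4 }));
```

Breadth-first order means pages closest to the start URL are visited first, so when the page limit cuts the crawl short, the pages dropped are the deepest ones.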

Library API

import {
  // Core pipeline
  researchPipeline,
  iterativeResearchPipeline,
  comparePipeline,

  // Search & fetch
  searchWeb,
  fetchUrl,

  // Browser rendering
  renderExtract,
  compactAccessibilitySnapshot,

  // Crawling
  crawlSite,

  // Planning & audit
  buildResearchPlan,
  auditResearchRun,

  // Scoring & diversity
  assessSourceQuality,
  diversifyByHost,
  shouldRenderEscalate,

  // Structured extraction
  extractSchema,
  extractSchemaFromUrl,
  parseSchema,

  // Output formatting
  toMarkdown,
  toSummary,
  toJsonFeed,
  toResearchReport,

  // Chunking & citations
  chunkText,
  attachCitationMetadata,
  buildCitationLedger,

  // Tracing
  writeTrace,
} from '@builtbyecho/research';

Quick examples

Run an iterative pipeline:

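import { iterativeResearchPipeline, toResearchReport } from '@builtbyecho/research';

// Expand the query into multiple angles and run two research rounds.
const result = await iterativeResearchPipeline('Playwright MCP best practices', {
  expand: true,
  count: 6,
  rounds: 2,
});

// Render the run as an executive markdown report.
console.log(toResearchReport(result));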

Extract structured data from a URL:

import { extractSchemaFromUrl } from '@builtbyecho/research';

const extracted = await extractSchemaFromUrl('https://example.com', 'links,headings');
console.log(extracted.data);

Fetch and chunk with citations:

import { fetchUrl, chunkText, attachCitationMetadata } from '@builtbyecho/research';

const { text, metadata } = await fetchUrl('https://example.com');
const chunks = chunkText(text, { chunkChars: 1200, overlapChars: 150 });
const cited = attachCitationMetadata(chunks, metadata);
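
To clarify how `chunkChars` and `overlapChars` relate, here is a standalone sketch of fixed-size chunking with overlap. `chunkWithOverlap` is a hypothetical function written for this example; the package's `chunkText` may split on different boundaries or return a different shape.

```javascript
// Hypothetical fixed-size chunker with overlap; offsets are character indexes.
function chunkWithOverlap(text, { chunkChars = 1200, overlapChars = 150 } = {}) {
  if (overlapChars >= chunkChars) {
    throw new Error('overlapChars must be smaller than chunkChars');
  }
  const step = chunkChars - overlapChars; // how far each chunk start advances
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    const end = Math.min(start + chunkChars, text.length);
    chunks.push({ start, end, text: text.slice(start, end) });
    if (end === text.length) break; // last chunk reached the end of the text
  }
  return chunks;
}

const demo = chunkWithOverlap('a'.repeat(3000));
console.log(demo.map((c) => [c.start, c.end]));
```

With the defaults, each chunk starts 1050 characters after the previous one, so the last 150 characters of one chunk repeat at the start of the next and a sentence that straddles a boundary still appears whole in at least one chunk.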

Render a JS-heavy page and get its ARIA tree:

import { renderExtract, compactAccessibilitySnapshot } from '@builtbyecho/research';

const { text, links, snapshot } = await renderExtract('https://spa.example.com');
const aria = compactAccessibilitySnapshot(snapshot);

Architecture

src/
├── cli.mjs              # Commander CLI — all commands and flags
├── index.mjs            # Public ES module exports
├── search.mjs           # Brave API + DuckDuckGo fallback
├── fetch-url.mjs        # HTTP fetch + Readability + metadata
├── render.mjs           # Playwright render + ARIA snapshots
├── crawl.mjs            # BFS crawl with depth/page limits
├── pipeline.mjs         # Full research pipeline + escalation logic
├── query-plan.mjs       # Deterministic multi-angle query planner
├── source-quality.mjs   # Source scoring + host diversity reranking
├── research-audit.mjs   # Post-run audit (grade, warnings, follow-ups)
├── report.mjs           # Citation-backed markdown report writer
├── schema-extract.mjs   # Local heuristic structured extraction
├── extractors.mjs       # Individual field extractors (email, phone, etc.)
├── chunking.mjs         # Text chunking + citation metadata
├── formatters.mjs       # Markdown, summary, JSON Feed output
├── traces.mjs           # Reproducible JSON trace writer
├── cache.mjs            # In-memory cache with TTL + stats
└── url-utils.mjs        # URL normalization and domain helpers

Data flow:

  1. search.mjs discovers URLs via Brave or DuckDuckGo
  2. fetch-url.mjs does cheap HTTP + Readability extraction
  3. render.mjs escalates to Playwright when needed
  4. source-quality.mjs scores and reranks by quality + diversity
  5. research-audit.mjs grades the run and suggests follow-ups
  6. report.mjs or formatters.mjs writes the final output
  7. traces.mjs saves the full artifact for reproducibility

Package contents

The package intentionally includes only what users need:

  • src/ — all source modules
  • examples/ — quickstart examples
  • assets/brand/ — logo assets
  • README.md
  • LICENSE
  • .env.example

It excludes local caches, traces, browser profiles, screenshots, tests, and development output.

Development

npm install
npx playwright install chromium
npm test
npm run pack:check

Run from source:

node src/cli.mjs report "open source research agents" -n 6 --rounds 2

CI

GitHub Actions runs:

  • npm ci
  • Playwright Chromium install
  • Smoke tests
  • npm pack --dry-run

Design principles

  • Built by agents, for agents — CLI-first, JSON-first, traceable.
  • Cheap first — static fetch before Playwright.
  • Local-first — no required paid scraping API.
  • Citation-aware — sources and chunks carry URLs and timestamps.
  • Auditable — plans, scores, warnings, follow-ups, and traces are first-class.
  • Composable — works as a CLI or ES module library.
  • Practical over magical — deterministic heuristics where they beat opaque model calls.

Brand

BuiltByEcho tools are meant to feel sharp, local-first, useful, and agent-native.

This is the first public research tool in that line: a small, inspectable harness that does real work and leaves receipts.

Contributing

  1. Fork the repo
  2. Create a feature branch (git checkout -b feature/my-thing)
  3. Make your changes
  4. Run npm test and npm run pack:check
  5. Open a pull request

Bug reports and feature requests are welcome at GitHub Issues.

Changelog

See CHANGELOG.md for release history.

License

MIT © BuiltByEcho