
@koichikawamura/omni-fetcher

v0.1.1


omni-fetcher

An MCP server for fetching web content in whatever form you need. Every page is first rendered with a headless Chromium (Playwright) so JavaScript-rendered content is captured, then returned as clean Markdown, raw HTML, or a screenshot. Rendered HTML is cached per URL/proxy, and proxies can be referenced by short id from a local database.

Tool: extract

| Argument | Type | Required | Description |
|---|---|:-:|---|
| url | string | yes | URL of the page to fetch |
| format | enum | no | Output form (default mercury). One of rendered_html, mercury, defuddle, screenshot. |
| proxy | string | no | Full proxy URL (http://host:port, socks5://host:port, http://user:pass@host:port) or a proxy id registered in the proxy database. Overrides MERCURY_PROXY for this call. |
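
As a rough illustration, an MCP client invokes a tool through a JSON-RPC `tools/call` request. The sketch below builds such a payload for extract. The helper name `buildExtractCall` and the request id are hypothetical; only the argument names (url, format, proxy) come from the table above.

```javascript
// Hypothetical helper: builds a JSON-RPC 2.0 "tools/call" request for the extract tool.
// Only the argument names (url, format, proxy) are taken from the README.
function buildExtractCall(id, url, options = {}) {
  const args = { url };
  // Omitted options are left out so the server applies its own defaults.
  if (options.format !== undefined) args.format = options.format;
  if (options.proxy !== undefined) args.proxy = options.proxy;
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "extract", arguments: args },
  };
}

const exampleRequest = buildExtractCall(1, "https://example.com", {
  format: "defuddle",
  proxy: "jp-tokyo",
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

Most MCP client libraries build this envelope for you; the sketch only shows where the three arguments end up.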

Formats (cheap → expensive)

Pick the cheapest form that meets the need and escalate only if it fails or is insufficient:

| Format | Returns | When to use |
|---|---|---|
| mercury | Article Markdown via Mercury Parser | Default; clean article text. |
| defuddle | Article Markdown via Defuddle | When Mercury misses content; a second-opinion extractor. |
| rendered_html | Full rendered HTML | When you need raw markup, or both extractors fail. (Large output.) |
| screenshot | Full-page PNG (base64 image) | Most expensive; when visual layout matters or text extraction fails. |
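
The "cheapest first, escalate on failure" advice can be sketched as a small client-side loop. This is an illustration only: `callExtract` is a hypothetical async function wrapping however your MCP client invokes extract, and `isGoodEnough` is a caller-supplied check. Neither is part of this package.

```javascript
// Formats ordered cheapest to most expensive, as listed in the table.
const FORMATS = ["mercury", "defuddle", "rendered_html", "screenshot"];

// callExtract: hypothetical async (url, format) => output.
// isGoodEnough: caller-supplied (output, format) => boolean.
async function extractWithEscalation(callExtract, url, isGoodEnough) {
  let lastError = null;
  for (const format of FORMATS) {
    try {
      const output = await callExtract(url, format);
      if (isGoodEnough(output, format)) return { format, output };
    } catch (err) {
      // A thrown error counts as "this format failed"; move on to the next one.
      lastError = err;
    }
  }
  const reason = lastError ? `: ${lastError.message}` : "";
  throw new Error(`No format produced usable output for ${url}${reason}`);
}
```

A caller might treat an empty string as insufficient for the text formats and accept any non-empty screenshot.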

mercury, defuddle, and rendered_html all reuse the same cached rendered HTML and follow pagination links automatically. screenshot always drives a live browser and is not cached.
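
To make the caching behaviour concrete, here is a minimal in-memory sketch of a render cache keyed by URL and proxy with a time-to-live. The real server keeps its cache in SQLite; the class name, key shape, and injectable clock below are assumptions for illustration.

```javascript
// In-memory stand-in for the render cache (the actual server stores entries in SQLite).
class RenderCache {
  // ttlSeconds mirrors the documented OMNI_CACHE_TTL default; now() is injectable for testing.
  constructor(ttlSeconds = 86400, now = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;
    this.entries = new Map();
  }

  // Assumed key shape: one entry per (url, proxy) pair.
  static key(url, proxy) {
    return JSON.stringify([url, proxy ?? null]);
  }

  get(url, proxy) {
    const entry = this.entries.get(RenderCache.key(url, proxy));
    if (!entry) return undefined;
    if (this.now() - entry.storedAt >= this.ttlMs) return undefined; // expired
    return entry.html;
  }

  set(url, proxy, html) {
    this.entries.set(RenderCache.key(url, proxy), { html, storedAt: this.now() });
  }
}
```

Under this model, a mercury call followed by a defuddle call for the same URL and proxy would render the page once and run both extractors over the same stored HTML.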

The headless browser runs through playwright-extra with the stealth plugin, which masks the automation fingerprint (navigator.webdriver, window.chrome, WebGL vendor, …) that some sites use to detect bots and reset their connections. Navigation waits on domcontentloaded and retries once on transient connection resets. This still won't beat the strongest anti-bot stacks; pairing with a residential proxy helps there.
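
The "retry once on a transient connection reset" behaviour can be sketched as a small wrapper. This is not the package's implementation: `navigate` is a hypothetical async function performing one navigation attempt, and the rule for what counts as transient is an assumption.

```javascript
// Assumed heuristic: a Node-level ECONNRESET code or Chromium's
// net::ERR_CONNECTION_RESET message counts as transient.
function looksTransient(err) {
  return err.code === "ECONNRESET" || /ERR_CONNECTION_RESET/.test(String(err.message));
}

// navigate: hypothetical async function that performs a single navigation attempt.
async function withSingleRetry(navigate, isTransient = looksTransient) {
  try {
    return await navigate();
  } catch (err) {
    if (!isTransient(err)) throw err; // non-transient errors surface immediately
    return await navigate(); // second and final attempt; its error propagates
  }
}
```

Capping the retry at one attempt keeps worst-case latency bounded when a site resets every connection.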

Tool: list_proxies

Returns the registered proxies (id, url, location) so you know which ids you can pass to extract.

Proxy database

Proxies are seeded from a JSON config file at startup (see proxies.example.json):

[
  { "id": "jp-tokyo", "url": "socks5://localhost:1080", "location": "Tokyo, JP" },
  { "id": "us-east", "url": "http://user:pass@host:3128", "location": "Virginia, US" }
]

Save it as ~/.omni-fetcher/proxies.json, the default location (created on first run), or point OMNI_PROXIES_FILE or OMNI_DATA_DIR elsewhere. After that you can call extract with proxy: "jp-tokyo" instead of the full URL. A proxy value containing :// is always treated as a literal URL; otherwise it is looked up as an id.
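
The lookup rule is simple enough to sketch in a few lines. Here a `Map` stands in for the proxy database, and throwing on an unknown id is an assumption; the README does not say how the server handles that case.

```javascript
// registry: Map of proxy id -> proxy URL, standing in for the proxy database.
function resolveProxy(value, registry) {
  if (value.includes("://")) return value; // literal URL, never looked up
  const url = registry.get(value);
  // Assumed behaviour: the README does not specify what happens for unknown ids.
  if (url === undefined) throw new Error(`Unknown proxy id: ${value}`);
  return url;
}

const exampleRegistry = new Map([["jp-tokyo", "socks5://localhost:1080"]]);
console.log(resolveProxy("jp-tokyo", exampleRegistry));
console.log(resolveProxy("http://10.0.0.5:3128", exampleRegistry));
```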

Environment variables

| Variable | Default | Description |
|---|---|---|
| MCP_TRANSPORT | stdio | stdio or http (Streamable HTTP). |
| MCP_HOST | 127.0.0.1 | Bind address when MCP_TRANSPORT=http. |
| MCP_PORT | 3000 | Bind port when MCP_TRANSPORT=http. |
| MCP_PATH | /mcp | URL path when MCP_TRANSPORT=http. |
| MERCURY_PROXY | (none) | Default proxy for the headless browser, used when no proxy argument is given. |
| OMNI_DATA_DIR | ~/.omni-fetcher | Directory holding the SQLite DB and proxy config. Used so npx works regardless of cwd. |
| OMNI_DB_PATH | <OMNI_DATA_DIR>/omni-fetcher.db | SQLite file holding the render cache and proxy database. |
| OMNI_CACHE_TTL | 86400 | Rendered-HTML cache lifetime, in seconds. |
| OMNI_PROXIES_FILE | <OMNI_DATA_DIR>/proxies.json | JSON file seeding the proxy database. |
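
To show how these defaults chain together, the sketch below resolves the variables into a single config object. The function name and the shape of the returned object are hypothetical; the variable names and defaults are the ones in the table, with ~ assumed to mean the current user's home directory.

```javascript
const os = require("node:os");
const path = require("node:path");

// Hypothetical resolver: applies the documented defaults to an environment object.
function loadConfig(env = process.env) {
  // OMNI_DB_PATH and OMNI_PROXIES_FILE default to locations inside the data directory.
  const dataDir = env.OMNI_DATA_DIR || path.join(os.homedir(), ".omni-fetcher");
  return {
    transport: env.MCP_TRANSPORT || "stdio",
    host: env.MCP_HOST || "127.0.0.1",
    port: Number(env.MCP_PORT || 3000),
    mcpPath: env.MCP_PATH || "/mcp",
    defaultProxy: env.MERCURY_PROXY || null,
    dataDir,
    dbPath: env.OMNI_DB_PATH || path.join(dataDir, "omni-fetcher.db"),
    cacheTtlSeconds: Number(env.OMNI_CACHE_TTL || 86400),
    proxiesFile: env.OMNI_PROXIES_FILE || path.join(dataDir, "proxies.json"),
  };
}

console.log(loadConfig({ MCP_TRANSPORT: "http", MCP_PORT: "3030" }));
```

Setting OMNI_DATA_DIR alone moves both the database and the proxy file, which is why the two path variables rarely need to be set individually.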

Usage

Claude Desktop (local stdio)

{
    "mcpServers": {
        "omni-fetcher": {
            "command": "npx",
            "args": ["@koichikawamura/omni-fetcher", "omni-fetcher-mcp"]
        }
    }
}

No paths to configure: the SQLite cache and proxy database default to ~/.omni-fetcher/, so this works under npx regardless of the working directory. Drop a proxies.json in that directory to register proxies by id.

Remote MCP (Streamable HTTP)

MCP_TRANSPORT=http \
MCP_HOST=127.0.0.1 \
MCP_PORT=3030 \
npx @koichikawamura/omni-fetcher omni-fetcher-mcp

Clients reach it at http://<host>:<port>/mcp. Pair with mcp-remote on the client side, or register directly as a connector in Claude.ai / ChatGPT if the endpoint is gated by an OAuth-aware proxy.
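
For clients that only speak stdio, a config entry that launches mcp-remote against the endpoint might look like the output of the sketch below. The entry name `omni-fetcher-remote` and the helper function are hypothetical, and the host and port simply reuse the values from the command above.

```javascript
// Hypothetical helper: builds a stdio client entry that bridges to the HTTP endpoint via mcp-remote.
function mcpRemoteEntry(host, port, mcpPath = "/mcp") {
  const url = `http://${host}:${port}${mcpPath}`;
  return { command: "npx", args: ["mcp-remote", url] };
}

const remoteClientConfig = {
  mcpServers: { "omni-fetcher-remote": mcpRemoteEntry("127.0.0.1", 3030) },
};
console.log(JSON.stringify(remoteClientConfig, null, 4));
```

The printed JSON has the same shape as the Claude Desktop config shown earlier, with the server command swapped for the mcp-remote bridge.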

CLI (ad-hoc testing)

node extractContent.js <url> [format] [proxy]
# e.g.
node extractContent.js https://example.com defuddle
node extractContent.js https://example.com screenshot   # prints base64 PNG to stdout
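
Since the screenshot form arrives as base64 text, a small decoder is handy for turning it back into an image. The sketch below assumes stdout contains only the base64 payload (plus optional surrounding whitespace); the function name is hypothetical.

```javascript
// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decodes base64 text to bytes and checks that the result starts like a PNG.
function decodePngBase64(base64Text) {
  const bytes = Buffer.from(base64Text.trim(), "base64");
  if (bytes.length < 8 || !bytes.subarray(0, 8).equals(PNG_SIGNATURE)) {
    throw new Error("Decoded data does not start with the PNG signature");
  }
  return bytes;
}
```

The returned buffer can be written to disk with `fs.writeFileSync("page.png", bytes)` after capturing the CLI output.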

Requirements

  • Node.js 22.5+ (uses the built-in node:sqlite module — no native build step)
  • Playwright Chromium (auto-installed on first launch if missing)
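
A quick way to confirm the Node.js requirement before launching is to compare the running version against 22.5. The helper below is a hypothetical sketch and is not shipped with the package.

```javascript
// Returns true when a "major.minor.patch" version string is at least minMajor.minMinor.
function meetsMinimumNode(version, minMajor = 22, minMinor = 5) {
  const [major, minor] = version.split(".").map(Number);
  return major > minMajor || (major === minMajor && minor >= minMinor);
}

// process.versions.node holds the running version without a leading "v".
console.log(meetsMinimumNode(process.versions.node));
```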

License

MIT — see LICENSE.