
pi-web-tools (v0.1.0)

Web search, content extraction, and GitHub repo cloning for the Pi coding agent.

A lightweight extension providing three tools:

  • web_search — Search the web via Exa with snippet extraction
  • fetch_content — Fetch any URL and extract clean markdown (HTML via Readability, Jina Reader fallback, GitHub via clone)
  • get_search_content — Retrieve stored results from previous searches/fetches

Install

pi install npm:pi-web-tools

Or install from git:

pi install github:coctostan/pi-web-tools

Setup

Exa API Key (required for web_search)

Get a key at exa.ai and set it via environment variable:

export EXA_API_KEY="your-key-here"

Or add it to the config file ~/.pi/web-tools.json:

{
  "exaApiKey": "your-key-here"
}

The environment variable takes precedence over the config file.
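A minimal sketch of that precedence rule, assuming the config file shape shown above (the helper name `resolveExaApiKey` is illustrative, not the extension's actual API):

```typescript
import { readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Illustrative sketch: the EXA_API_KEY environment variable wins
// over the exaApiKey field in ~/.pi/web-tools.json.
function resolveExaApiKey(): string | null {
  const fromEnv = process.env.EXA_API_KEY;
  if (fromEnv) return fromEnv; // environment variable takes precedence
  try {
    const raw = readFileSync(join(homedir(), ".pi", "web-tools.json"), "utf8");
    const cfg = JSON.parse(raw) as { exaApiKey?: string };
    return cfg.exaApiKey ?? null;
  } catch {
    return null; // missing config file or unreadable JSON
  }
}
```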

GitHub CLI (recommended for fetch_content)

For GitHub repo cloning, install the GitHub CLI:

# Debian/Ubuntu
sudo apt install gh

# Or via brew, conda, etc.

# Then authenticate
gh auth login

Without gh, the extension falls back to git clone (works for public repos).

Configuration

Config file: ~/.pi/web-tools.json (auto-reloaded every 30 seconds)

{
  "exaApiKey": "your-exa-key",
  "github": {
    "maxRepoSizeMB": 350,
    "cloneTimeoutSeconds": 30,
    "clonePath": "/tmp/pi-github-repos"
  }
}

| Option | Default | Description |
|--------|---------|-------------|
| exaApiKey | null | Exa API key (env EXA_API_KEY overrides) |
| github.maxRepoSizeMB | 350 | Skip cloning repos larger than this |
| github.cloneTimeoutSeconds | 30 | Abort clone after this many seconds |
| github.clonePath | /tmp/pi-github-repos | Where to store cloned repos |

Tools

web_search

Search the web using Exa. Returns results with snippets and source URLs.

| Parameter | Type | Description |
|-----------|------|-------------|
| query | string | Single search query |
| queries | string[] | Multiple queries (batch) |
| numResults | number | Results per query (default: 5, max: 20) |

Example:

Search for "TypeScript 5.8 new features"

fetch_content

Fetch URL(s) and extract readable content as markdown.

| Parameter | Type | Description |
|-----------|------|-------------|
| url | string | Single URL to fetch |
| urls | string[] | Multiple URLs (fetched in parallel, max 3 concurrent) |
| forceClone | boolean | Force cloning of large GitHub repos |

Content extraction pipeline:

  1. GitHub URLs → Clone repo (shallow, depth 1), generate tree + README
  2. HTML pages → Readability extraction → Markdown conversion
  3. If Readability fails → fall back to Jina Reader (r.jina.ai)
  4. Non-HTML → Return raw text

Content over 30,000 characters is truncated with a pointer to get_search_content.
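The pipeline above can be sketched as a dispatch over URL and content type (the `chooseStrategy` helper and its signature are illustrative assumptions, not the real extract.ts API):

```typescript
// Illustrative dispatch for the four-step extraction pipeline above.
type Strategy = "github-clone" | "readability" | "jina-fallback" | "raw-text";

function chooseStrategy(
  url: string,
  contentType: string,
  readabilityOk: boolean,
): Strategy {
  if (new URL(url).hostname === "github.com") return "github-clone"; // step 1
  if (!contentType.includes("text/html")) return "raw-text";         // step 4
  return readabilityOk ? "readability" : "jina-fallback";            // steps 2-3
}
```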

get_search_content

Retrieve full content from a previous web_search or fetch_content result.

| Parameter | Type | Description |
|-----------|------|-------------|
| responseId | string | ID from a previous tool result |
| query | string | Filter by query text |
| queryIndex | number | Filter by query index |
| url | string | Filter by URL |
| urlIndex | number | Filter by URL index |

How GitHub Cloning Works

When fetch_content receives a GitHub URL:

  1. Parse — Extracts owner, repo, ref, path, type (root/blob/tree)
  2. Size check — Queries repo size via gh api. Skips if over threshold (default 350MB)
  3. Clone — Shallow clone (--depth 1) to temp directory, cached for the session
  4. Generate — Based on URL type:
    • Root: Full directory tree + README content
    • Tree: Directory listing for the specified path
    • Blob: File content (with binary detection and 100K truncation)

Non-code GitHub URLs (issues, PRs, discussions, etc.) are fetched as normal web pages.
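Step 1 (Parse) might look roughly like this (the `GitHubTarget` type and `parseGitHubUrl` function are hypothetical names, not the extension's actual code):

```typescript
// Illustrative sketch of parsing a GitHub URL into owner/repo/ref/path/type.
type GitHubTarget = {
  owner: string;
  repo: string;
  kind: "root" | "blob" | "tree";
  ref?: string;
  path?: string;
};

function parseGitHubUrl(raw: string): GitHubTarget | null {
  const u = new URL(raw);
  if (u.hostname !== "github.com") return null;
  const [owner, repo, kind, ref, ...rest] = u.pathname.split("/").filter(Boolean);
  if (!owner || !repo) return null;
  if (kind === "blob" || kind === "tree") {
    return { owner, repo, kind, ref, path: rest.join("/") };
  }
  if (kind) return null; // issues, pulls, etc. → fetched as normal web pages
  return { owner, repo, kind: "root" };
}
```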

Architecture

index.ts          — Extension entry point, 3 tools, session management
├── config.ts     — Config with 30s TTL cache, env var overrides
├── storage.ts    — LRU storage (max 50 entries, session restore)
├── exa-search.ts — Exa API client
├── extract.ts    — Readability + Jina Reader content extraction
└── github-extract.ts — GitHub URL parsing, clone, tree/content generation
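The bounded LRU storage in storage.ts (max 50 entries) can be sketched with a plain `Map`, relying on its insertion-order iteration; class and method names here are illustrative, not the extension's actual API:

```typescript
// Minimal LRU sketch: most recently used entries live at the end of the Map.
class LruStore<V> {
  private entries = new Map<string, V>();
  constructor(private readonly maxEntries: number = 50) {}

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key)!;
    // Re-insert to mark as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first in insertion order).
      const oldest = this.entries.keys().next().value!;
      this.entries.delete(oldest);
    }
  }
}
```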

Development

# Install dependencies
npm install

# Run tests
npx vitest run

# Run tests in watch mode
npx vitest

# Load in pi for testing
pi -e ./index.ts

License

MIT