@steipete/summarize

v0.4.0

Link → clean text → summary.

Summarize 👉 Point at any URL or file. Get the gist.

Fast CLI for summarizing anything you can point at:

  • Web pages (article extraction; Firecrawl fallback if sites block agents)
  • YouTube links (best-effort transcripts, optional Apify fallback)
  • Remote files (PDFs/images/audio/video via URL — downloaded and forwarded to the model)
  • Local files (PDFs/images/audio/video/text — forwarded or inlined; support depends on provider/model)

The CLI streams output by default on a TTY and renders Markdown to ANSI (via markdansi). At the end it prints a single “Finished in …” line with timing, token usage, and a best-effort cost estimate (when pricing is available).

Install

Requires Node 22+.

  • npx (no install):
npx -y @steipete/summarize "https://example.com" --model google/gemini-3-flash-preview
  • Homebrew (custom tap):
brew install steipete/tap/summarize

The Homebrew formula is Apple Silicon only (arm64).

Quickstart

summarize "https://example.com" --model google/gemini-3-flash-preview

Input can be a URL or a local file path:

npx -y @steipete/summarize "/path/to/file.pdf" --model google/gemini-3-flash-preview
npx -y @steipete/summarize "/path/to/image.jpeg" --model google/gemini-3-flash-preview

Remote file URLs work the same (best-effort; the file is downloaded and passed to the model):

npx -y @steipete/summarize "https://example.com/report.pdf" --model google/gemini-3-flash-preview

YouTube (supports youtube.com and youtu.be):

npx -y @steipete/summarize "https://youtu.be/dQw4w9WgXcQ" --youtube auto

What file types work?

This is “best effort” and depends on what your selected model/provider accepts. In practice these usually work well:

  • text/* and common structured text (.txt, .md, .json, .yaml, .xml, …)
    • text-like files are inlined into the prompt (instead of attached as a file part) for better provider compatibility
  • PDFs: application/pdf (provider support varies; Google is the most reliable in this CLI right now)
  • Images: image/jpeg, image/png, image/webp, image/gif
  • Audio/Video: audio/*, video/* (when supported by the model)

Notes:

  • If a provider rejects a media type, the CLI fails fast with a friendly message (no “mystery stack traces”).
  • xAI models currently don’t support attaching generic files (like PDFs) via the AI SDK; use a Google/OpenAI/Anthropic model for those.
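
For example (same pattern as the Quickstart, with placeholder paths), a Markdown note or an audio recording can be summarized directly, provided the selected model accepts the media type:

npx -y @steipete/summarize "/path/to/notes.md" --model google/gemini-3-flash-preview
npx -y @steipete/summarize "/path/to/recording.mp3" --model google/gemini-3-flash-preview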

Model ids

Use “gateway-style” ids: <provider>/<model>.

Examples:

  • openai/gpt-5.2
  • anthropic/claude-opus-4-5
  • xai/grok-4-fast-non-reasoning
  • google/gemini-3-flash-preview

Note: some models/providers don’t support streaming or certain file media types. When that happens, the CLI prints a friendly error (or auto-disables streaming for that model when supported by the provider).

Output length

--length controls how much output we ask for (guideline), not a hard truncation.

npx -y @steipete/summarize "https://example.com" --length long
npx -y @steipete/summarize "https://example.com" --length 20k
  • Presets: short|medium|long|xl|xxl
  • Character targets: 1500, 20k, 20000
  • Optional hard cap: --max-output-tokens <count> (e.g. 2000, 2k)
    • Provider/model APIs still enforce their own maximum output limits.
  • Minimums: --length numeric values must be ≥ 50 chars; --max-output-tokens must be ≥ 16.
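
For example, a character target can be combined with a hard token cap (values are illustrative):

npx -y @steipete/summarize "https://example.com" --length 1500 --max-output-tokens 2k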

Limits

  • Text inputs over 10 MB are rejected before tokenization.
  • Text prompts are preflighted against the model’s input limit (LiteLLM catalog), using a GPT tokenizer.

Common flags

npx -y @steipete/summarize <input> [flags]
  • --model <provider/model>: which model to use (defaults to google/gemini-3-flash-preview)
  • --timeout <duration>: 30s, 2m, 5000ms (default 2m)
  • --length short|medium|long|xl|xxl|<chars>
  • --max-output-tokens <count>: hard cap for LLM output tokens (optional)
  • --stream auto|on|off: stream LLM output (auto = TTY only; disabled in --json mode)
  • --render auto|md-live|md|plain: Markdown rendering (auto = best default for TTY)
  • --format md|text: website/file content format (default text)
  • --preprocess off|auto|always: preprocess files (only with --format md) for model compatibility (default auto)
  • --extract: print extracted content and exit (no summary) — only for URLs
    • Deprecated alias: --extract-only
  • --json: machine-readable output with diagnostics, prompt, metrics, and optional summary
  • --verbose: debug/diagnostics on stderr
  • --metrics off|on|detailed: metrics output (default on; detailed prints a breakdown to stderr)
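
Two illustrative combinations of the flags above (treat these as sketches; not every pairing has been verified together):

npx -y @steipete/summarize "https://example.com" --model anthropic/claude-opus-4-5 --length short --timeout 30s
npx -y @steipete/summarize "https://example.com" --format md --metrics detailed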

Website extraction (Firecrawl + Markdown)

Non-YouTube URLs go through a “fetch → extract” pipeline. When the direct fetch/extraction is blocked or too thin, --firecrawl auto can fall back to Firecrawl (if configured).

  • --firecrawl off|auto|always (default auto)
  • --extract --format md|text (default text)
  • --markdown-mode off|auto|llm (default auto; only affects --format md for non-YouTube URLs)
  • Plain-text mode: use --format text.
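
For example, to print extracted Markdown without summarizing, or to force the Firecrawl fallback (assumes FIRECRAWL_API_KEY is set):

npx -y @steipete/summarize "https://example.com" --extract --format md
npx -y @steipete/summarize "https://example.com" --firecrawl always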

YouTube transcripts

--youtube auto tries best-effort web transcript endpoints first. When captions aren't available, it falls back to:

  1. Apify (if APIFY_API_TOKEN is set): Uses a scraping actor (faVsWy9VTSNVIhWpR)
  2. yt-dlp + Whisper (if YT_DLP_PATH is set): downloads audio via yt-dlp, then transcribes with OpenAI Whisper when OPENAI_API_KEY is set, falling back to FAL (FAL_KEY) otherwise

Environment variables for yt-dlp mode:

  • YT_DLP_PATH - path to yt-dlp binary
  • OPENAI_API_KEY - OpenAI Whisper transcription (preferred)
  • FAL_KEY - FAL AI Whisper fallback

Apify costs money but tends to be more reliable when captions exist.
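
A sketch of the yt-dlp + Whisper path; the binary path assumes a Homebrew install, and the key value is a placeholder:

YT_DLP_PATH=/opt/homebrew/bin/yt-dlp OPENAI_API_KEY=sk-... npx -y @steipete/summarize "https://youtu.be/dQw4w9WgXcQ" --youtube auto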

Configuration

Single config location:

  • ~/.summarize/config.json

Supported keys today:

{
  "model": "openai/gpt-5.2"
}

Precedence:

  1. --model
  2. SUMMARIZE_MODEL
  3. ~/.summarize/config.json
  4. default
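
For example, with the config file above pinning openai/gpt-5.2, an environment variable or a flag still takes precedence:

SUMMARIZE_MODEL=anthropic/claude-opus-4-5 summarize "https://example.com"
summarize "https://example.com" --model google/gemini-3-flash-preview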

Environment variables

Set the key matching your chosen --model:

  • OPENAI_API_KEY (for openai/...)
  • ANTHROPIC_API_KEY (for anthropic/...)
  • XAI_API_KEY (for xai/...)
  • GEMINI_API_KEY (for google/...)
    • also accepts GOOGLE_GENERATIVE_AI_API_KEY and GOOGLE_API_KEY as aliases
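
For example (placeholder keys), each provider prefix just needs its matching key in the environment:

GEMINI_API_KEY=... summarize "https://example.com" --model google/gemini-3-flash-preview
XAI_API_KEY=... summarize "https://example.com" --model xai/grok-4-fast-non-reasoning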

OpenRouter (OpenAI-compatible):

  • Set OPENROUTER_API_KEY=... to route openai/... models through OpenRouter
  • Use OpenRouter models via the openai/... prefix, e.g. --model openai/openai/gpt-oss-20b
  • Optional: OPENROUTER_PROVIDERS=... to specify provider fallback order (e.g. groq,google-vertex)

Example:

OPENROUTER_API_KEY=sk-or-... summarize "https://example.com" --model openai/openai/gpt-oss-20b

With provider ordering (falls back through providers in order):

OPENROUTER_API_KEY=sk-or-... OPENROUTER_PROVIDERS="groq,google-vertex" summarize "https://example.com"

Legacy: OPENAI_BASE_URL=https://openrouter.ai/api/v1 with OPENAI_API_KEY also works.

Optional services:

  • FIRECRAWL_API_KEY (website extraction fallback)
  • YT_DLP_PATH (path to yt-dlp binary for audio extraction)
  • FAL_KEY (FAL AI API key for audio transcription via Whisper)
  • APIFY_API_TOKEN (YouTube transcript fallback)

Model limits

The CLI uses the LiteLLM model catalog for model limits (like max output tokens):

  • Downloaded from: https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json
  • Cached at: ~/.summarize/cache/

Library usage (optional)

This package also exports a small library:

  • @steipete/summarize/content
  • @steipete/summarize/prompts

Development

pnpm install
pnpm check