@gulibs/safe-coder-cli · v0.0.3 · 286 downloads

Safe Coder CLI

Standalone CLI tool for documentation crawling with SPA support, error detection, and code validation.

Overview

@gulibs/safe-coder-cli is an independent command-line tool that crawls documentation websites and generates structured output. It supports both static sites and Single Page Applications (SPAs) using browser automation.

This CLI is designed to work standalone or as part of the Safe Coder ecosystem, where it's called by the @gulibs/safe-coder MCP Server.

Features

  • HTTP & Browser Crawling: Supports both static HTTP crawling and browser-based rendering for SPAs
  • Intelligent Content Extraction: Cleans and structures documentation content
  • Parallel Processing: Multi-worker support for faster crawling
  • Progress Reporting: Real-time progress updates via stderr
  • JSON Output: Machine-readable JSON output for programmatic use
  • Skill Generation: Generates AI-ready SKILL files from documentation
  • Checkpoint Support: Resume interrupted crawls
  • Proxy Support: Configure HTTP/HTTPS proxies

Installation

Global Installation (Recommended)

npm install -g @gulibs/safe-coder-cli

Or using yarn:

yarn global add @gulibs/safe-coder-cli

Or using pnpm:

pnpm add -g @gulibs/safe-coder-cli

Verify Installation

safe-coder-cli --version
safe-coder-cli --help

Usage

Basic Crawl

safe-coder-cli crawl https://react.dev

Crawl with Options

# Limit pages and depth
safe-coder-cli crawl https://react.dev --max-pages 50 --max-depth 3

# Use multiple workers for faster crawling
safe-coder-cli crawl https://react.dev --workers 5

# Force browser automation for SPAs
safe-coder-cli crawl https://spa-site.com --spa-strategy auto --browser playwright

# Save output to directory
safe-coder-cli crawl https://react.dev --output-dir ./skills

JSON Output (for MCP Integration)

# Output machine-readable JSON
safe-coder-cli crawl https://react.dev --output-format json

# Capture output to file
safe-coder-cli crawl https://react.dev --output-format json > output.json

Command Reference

crawl <url> [options]

Crawl a documentation website and optionally generate a SKILL file.

Options

  • -c, --config <path> - Path to configuration file
  • -b, --browser <type> - Browser type: puppeteer | playwright
  • -d, --max-depth <number> - Maximum crawl depth (default: 3)
  • -p, --max-pages <number> - Maximum number of pages to crawl (default: 50)
  • -w, --workers <number> - Number of parallel workers (default: 1)
  • --spa-strategy <type> - SPA strategy: smart | auto | manual (default: smart)
  • -o, --output-dir <path> - Output directory for skill files
  • -f, --filename <name> - Skill name for directory and file names
  • --checkpoint - Enable checkpoint/resume functionality
  • --resume - Resume from last checkpoint if available
  • --rate-limit <ms> - Delay in milliseconds between requests (default: 500)
  • --output-format <format> - Output format: json | pretty (default: pretty)
  • --include-paths <paths> - Additional path patterns to include (comma-separated)
  • --exclude-paths <paths> - Path patterns to exclude (comma-separated)

detect-errors <file> [options]

Detect errors and warnings in code files.

safe-coder-cli detect-errors ./src/app.ts
safe-coder-cli detect-errors ./src/app.ts --format json

validate-code <file> [options]

Validate and optionally fix code errors.

safe-coder-cli validate-code ./src/app.ts
safe-coder-cli validate-code ./src/app.ts --output ./src/app.fixed.ts

Configuration File

Create a .doc-crawler.json file in your project root:

{
  "browser": "puppeteer",
  "spaStrategy": "smart",
  "crawl": {
    "maxDepth": 3,
    "maxPages": 200,
    "workers": 5,
    "rateLimit": 300,
    "checkpoint": {
      "enabled": true,
      "interval": 50
    }
  },
  "proxy": "http://127.0.0.1:7890"
}
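Before a long crawl, it can be worth confirming the config file actually parses as JSON. The sketch below writes an abridged version of the example config above and validates it with Python's stdlib `json.tool` (assumed available as `python3`); the commented invocation uses the documented `--config` flag:

```shell
# Write an abridged .doc-crawler.json and sanity-check that it parses.
cat > .doc-crawler.json <<'EOF'
{
  "browser": "puppeteer",
  "spaStrategy": "smart",
  "crawl": { "maxDepth": 3, "maxPages": 200, "workers": 5, "rateLimit": 300 }
}
EOF
python3 -m json.tool .doc-crawler.json > /dev/null && echo "config OK"

# Then point the CLI at it explicitly:
#   safe-coder-cli crawl https://docs.example.com --config .doc-crawler.json
```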

Output Format

JSON Output Structure

When using --output-format json, the CLI outputs:

{
  "success": true,
  "data": {
    "source": {
      "url": "https://react.dev",
      "crawledAt": "2024-01-15T10:30:00.000Z",
      "pageCount": 50,
      "depth": 3
    },
    "pages": [
      {
        "url": "https://react.dev/learn",
        "title": "Learn React",
        "content": "...",
        "wordCount": 1500,
        "codeBlocks": 5,
        "headings": ["Getting Started", "Components"]
      }
    ],
    "metadata": {
      "technology": "react.dev",
      "categories": ["tutorial", "api", "guide"]
    },
    "statistics": {
      "totalPages": 50,
      "maxDepthReached": 3,
      "errors": 0
    },
    "skill": {
      "skillMd": "...",
      "quality": 85
    }
  }
}
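A caller consuming this envelope should check the `success` flag before trusting `data`. A minimal sketch, with the envelope inlined as a sample rather than coming from a real crawl (field names are taken from the structure above):

```shell
# Inline sample of the JSON envelope; in practice this would be the stdout of
#   safe-coder-cli crawl <url> --output-format json
result='{"success":true,"data":{"statistics":{"totalPages":50,"errors":0}}}'

# Extract the success flag with Python's stdlib json module.
ok=$(printf '%s' "$result" | python3 -c 'import sys,json; print(json.load(sys.stdin)["success"])')
if [ "$ok" = "True" ]; then echo "crawl succeeded"; fi
```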

Progress Output (stderr)

Progress information is output to stderr in JSON format:

{"type":"progress","message":"Crawled 10/50 pages","timestamp":"...","current":10,"total":50,"percentage":20}
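Since each progress line is a self-contained JSON object, a wrapper script can parse fields out of it directly. A sketch using the sample line above (python3 assumed available):

```shell
# A sample progress line as emitted on stderr, copied from the example above.
line='{"type":"progress","message":"Crawled 10/50 pages","timestamp":"...","current":10,"total":50,"percentage":20}'

# Pull out the percentage field for display or logging.
pct=$(printf '%s' "$line" | python3 -c 'import sys,json; print(json.load(sys.stdin)["percentage"])')
echo "$pct"   # prints 20
```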

Browser Setup

For SPA crawling, you need Chrome/Chromium installed:

macOS

brew install --cask google-chrome

Windows

winget install Google.Chrome

Linux

# google-chrome-stable requires Google's apt repository to be configured;
# on stock distributions, Chromium is the easier route:
sudo apt install chromium-browser
# or, with Google's repo added: sudo apt install google-chrome-stable

Custom Browser Path

export CHROME_PATH=/path/to/chrome

Environment Variables

  • CHROME_PATH - Path to Chrome executable
  • HTTP_PROXY - HTTP proxy URL
  • HTTPS_PROXY - HTTPS proxy URL
  • LOG_LEVEL - Log level (INFO, DEBUG, ERROR)
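Each variable can be exported before invoking the CLI; the paths and proxy address below are illustrative only:

```shell
# Point the CLI at a specific browser build, route traffic through a local
# proxy, and turn on verbose logging (example values, adjust for your setup).
export CHROME_PATH="/usr/bin/chromium-browser"
export HTTPS_PROXY="http://127.0.0.1:7890"
export LOG_LEVEL=DEBUG
echo "$LOG_LEVEL"   # prints DEBUG

# safe-coder-cli crawl https://docs.example.com
```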

Integration with MCP Server

The CLI is designed to be called by @gulibs/safe-coder MCP Server. The MCP Server:

  1. Checks that the CLI is installed
  2. Spawns CLI with appropriate parameters
  3. Monitors progress via stderr
  4. Parses JSON output from stdout
  5. Post-processes results and generates SKILL guidance
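Steps 3 and 4 rely on the CLI keeping its two streams separate: the JSON result on stdout, progress on stderr. The redirection pattern can be sketched with a stand-in command so it runs without a real crawl; swap in the real invocation in practice:

```shell
# Stand-in for: safe-coder-cli crawl https://docs.example.com --output-format json
# The point is the stream separation, not the fake output.
{ echo '{"success":true}'
  echo '{"type":"progress","percentage":100}' >&2
} > result.json 2> progress.log

cat result.json    # machine-readable result (was stdout)
cat progress.log   # progress stream (was stderr)
```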

Examples

Simple Documentation Crawl

safe-coder-cli crawl https://docs.example.com --max-pages 30

Fast Parallel Crawl

safe-coder-cli crawl https://docs.example.com --workers 8 --max-pages 200

SPA Site with Browser

safe-coder-cli crawl https://spa-site.com --spa-strategy auto --browser playwright

Generate SKILL and Save

safe-coder-cli crawl https://react.dev \
  --output-dir ~/.cursor/skills \
  --filename react-docs \
  --max-pages 100

JSON Output for Scripting

safe-coder-cli crawl https://docs.example.com \
  --output-format json \
  --max-pages 20 > output.json

# Process with jq
cat output.json | jq '.data.statistics'

Troubleshooting

CLI Not Found

After installation, if safe-coder-cli is not found:

# Check npm global bin path
npm config get prefix

# Add to PATH if needed (macOS/Linux)
export PATH="$(npm config get prefix)/bin:$PATH"

Browser Not Found

If you see "Chrome/Chromium not found":

  1. Install Chrome (see Browser Setup above)
  2. Set CHROME_PATH environment variable
  3. Or install the full puppeteer package, which bundles a Chromium download: npm install -g puppeteer

Permission Errors

On Linux/macOS, you may need sudo for global installation:

sudo npm install -g @gulibs/safe-coder-cli

Or use a version manager like nvm to avoid sudo.

Development

# Clone repository
git clone <repository-url>
cd safe-coder-cli

# Install dependencies
npm install

# Build
npm run build

# Link for local testing
npm link

# Test
safe-coder-cli --version

License

MIT

Related Projects