
crawler-scanner-sdk

v2.1.0


A Node.js SDK for integrating crawlergo, xray, and req probe tools


Crawler Scanner SDK

An easy-to-use, easy-to-modify Node.js SDK for driving the security scanning tools in the bin directory.

Features

  • 🎯 Simple to use: clean API design that is easy to understand
  • 🔧 Easy to modify: clear code structure, with each tool wrapped independently
  • 📦 Complete coverage: wraps every bin tool (crawlergo, clean, probe, getSample)
  • 📊 Typed: complete TypeScript type definitions

Installation

npm install crawler-scanner-sdk

Quick Start

Option 1: Using SecurityScanner (recommended)

const { SecurityScanner } = require("crawler-scanner-sdk");

// Create a scanner instance
const scanner = new SecurityScanner({
  binPath: "/path/to/bin", // path to the bin directory (required)
  chromePath: "/path/to/chrome", // path to Chrome (optional)
  xrayPath: "/path/to/xray", // path to the xray executable (optional; needed for vulnerability scans)
  storageType: "file", // storage type: 'file'
  storagePath: "./results", // storage path (optional; default: './scanner-results')
});

// 1. Run the crawler
const crawlResult = await scanner.runCrawlergo("https://example.com", {
  maxCrawlCount: 100,
});

// 2. Run the probe (replay mode)
const probeResult = await scanner.runProbe(crawlResult.outputFile, {
  enableReplay: true,
  timeout: 5.0,
  concurrency: 10,
});

// 3. Clean the results
const cleanResult = await scanner.runClean(probeResult.outputFile, {
  keywords: ["error", "not found"],
});

// 4. Extract samples
const sampleResult = await scanner.runGetSample(probeResult.outputFile);
console.log("Samples:", sampleResult.result.resp_data_sample);

// 5. Run an xray scan (requires xrayPath to be configured)
const xrayResult = await scanner.runXray(["http://example.com/api/users"], {
  outputFile: "/path/to/xray.json",
  plugins: ["xss", "sqldet", "cmd-injection"],
});
console.log("Found vulnerabilities:", xrayResult.result.length);

Option 2: Using the tool classes directly

const {
  CrawlergoTool,
  ProbeTool,
  CleanTool,
  GetSampleTool,
} = require("crawler-scanner-sdk");

// Call a tool directly
const crawlResult = await CrawlergoTool.run("https://example.com", {
  crawlergoPath: "/path/to/bin/crawlergo",
  outputFile: "./crawler.json",
});

const probeResult = await ProbeTool.run("./crawler.json", {
  probePath: "/path/to/bin/probe",
  enableReplay: true,
});

Data Models

Crawler output model (CrawlergoOutput)

interface CrawlergoOutput {
  req_list: CrawlergoRequest[];
  sub_domain_list: string[];
}

interface CrawlergoRequest {
  url: string;
  method: string;
  req_headers: Record<string, any> | null;
  req_data: string;
  has_param: boolean;
  source: string;
  is_relative: boolean;
}
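As a sketch of how this model might be consumed, the snippet below parses a crawlergo-style output document and selects the requests worth replaying. The sample JSON is hand-written to match the CrawlergoOutput shape above; it is illustrative, not real crawlergo output.

```javascript
// Hand-written sample following the CrawlergoOutput shape (illustrative only).
const sampleOutput = JSON.stringify({
  req_list: [
    {
      url: "https://example.com/api/users?id=1",
      method: "GET",
      req_headers: null,
      req_data: "",
      has_param: true,
      source: "DOM",
      is_relative: false,
    },
    {
      url: "https://example.com/about",
      method: "GET",
      req_headers: null,
      req_data: "",
      has_param: false,
      source: "Link",
      is_relative: false,
    },
  ],
  sub_domain_list: ["api.example.com"],
});

// Requests that carry parameters are typically the interesting ones to probe.
function requestsWithParams(json) {
  const output = JSON.parse(json); // CrawlergoOutput
  return output.req_list.filter((req) => req.has_param);
}

console.log(requestsWithParams(sampleOutput).map((r) => r.url));
// → [ 'https://example.com/api/users?id=1' ]
```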

Probe result model (ProbeResult)

interface ProbeResult {
  url: string;
  method: string;
  req_headers: Record<string, any> | null;
  req_data: string;
  status_code: number;
  resp_headers: Record<string, string>;
  resp_data: string;
  error?: string;
}

Response data sample model (RespDataSample)

interface RespDataSample {
  resp_data_sample: string[];
}

API Reference

SecurityScanner

runCrawlergo(targetUrls, options?)

Runs the crawlergo crawler.

Parameters:

  • targetUrls: target URL (string or array)
  • options: options object
    • maxTabCount: maximum number of browser tabs
    • maxCrawlCount: maximum number of URLs to crawl
    • maxRuntime: maximum runtime (seconds)
    • filterMode: filter mode ('simple' | 'smart')
    • proxy: proxy address
    • outputFile: output file path
    • taskId: custom task ID

Returns:

{
  taskId: string;
  result: CrawlergoOutput;
  outputFile: string;
}

runProbe(crawlergoFile, options?)

Runs the probe tool.

Parameters:

  • crawlergoFile: path to the crawlergo output file
  • options: options object
    • enableProbe: enable probe mode (for relative requests)
    • enableReplay: enable replay mode (for has_param, non-GET, and XHR requests)
    • apiBase: API base URL
    • timeout: timeout (seconds)
    • concurrency: concurrency level
    • outputFile: output file path

Returns:

{
  taskId: string;
  results: ProbeResult[];
  outputFile: string;
}

runClean(inputFile, options?)

Runs the clean tool.

Parameters:

  • inputFile: input file path (the JSONL produced by probe)
  • options: options object
    • keywords: keyword list (comma-separated string or array of strings) used to filter out invalid responses
    • outputFile: output file path

Returns:

{
  results: ProbeResult[];
  outputFile: string;
}

Notes:

  • If keywords is omitted, a bulk cleanup is performed
  • If keywords is provided, the keywords drive a precise second-pass filter
  • The keywords are passed to the clean tool's -k flag
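The effect of the keyword pass can be sketched in plain JavaScript: a probe result (one JSONL line) is dropped when its response body contains any of the given keywords. The real clean binary's matching rules may differ; this only illustrates the idea, and the sample lines are made up.

```javascript
// Two fake probe results in JSONL form (illustrative only).
const lines = [
  JSON.stringify({ url: "https://example.com/a", resp_data: '{"code":0,"data":[1]}' }),
  JSON.stringify({ url: "https://example.com/b", resp_data: '{"error":"not found"}' }),
];

// Keep only results whose response body matches none of the keywords.
function filterByKeywords(jsonlLines, keywords) {
  return jsonlLines
    .map((line) => JSON.parse(line))
    .filter((result) => !keywords.some((kw) => result.resp_data.includes(kw)));
}

console.log(filterByKeywords(lines, ["error", "not found"]).map((r) => r.url));
// → [ 'https://example.com/a' ]
```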

runGetSample(probeFile, options?)

Runs getSample to extract samples.

Parameters:

  • probeFile: path to the probe output file (JSONL)
  • options: options object
    • outputFile: output file path

Returns:

{
  result: RespDataSample; // { resp_data_sample: string[] }
  outputFile: string;
}

Notes:

  • getSample draws random samples of response bodies from the probe output
  • The returned resp_data_sample is an array of strings, each a JSON-formatted response body
  • These samples can be used to have an AI extract fingerprints of invalid responses
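Since each entry in resp_data_sample is itself a JSON string, a small parsing step is usually needed before the samples can be inspected. This sketch (with a made-up sample) parses each entry and keeps a null placeholder for any non-JSON body:

```javascript
// Made-up RespDataSample value for illustration.
const sample = { resp_data_sample: ['{"error":"not found"}', '{"code":0}'] };

// Parse each sample string; non-JSON bodies become null placeholders.
function parseSamples(respDataSample) {
  return respDataSample.map((s) => {
    try {
      return JSON.parse(s);
    } catch {
      return null;
    }
  });
}

const parsed = parseSamples(sample.resp_data_sample);
console.log(parsed[0].error); // → not found
```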

runXray(urls, options)

Runs an xray scan.

Parameters:

  • urls: array of URLs (required)
  • options: options object (required)
    • outputFile: output file path (required)
    • plugins: plugin list (optional), e.g. ['xss', 'sqldet', 'cmd-injection']

Returns:

{
  taskId: string;
  result: any[];  // array of vulnerabilities
}

Notes:

  • xray writes its output as a JSON array; the SDK reads the file and returns the parsed array
  • If xray finds no vulnerabilities or fails to run, an empty array is returned
  • A temporary URL list file is created and passed to xray

Example:

const xrayResult = await scanner.runXray(
  ["http://example.com/api/users", "http://example.com/api/posts"],
  {
    outputFile: "/path/to/xray.json",
    plugins: ["xss", "sqldet", "cmd-injection"],
  }
);
console.log("Found vulnerabilities:", xrayResult.result.length);

Code Structure

src/
├── index.js              # Main entry; exports all classes and tools
├── scanner.js            # SecurityScanner main class
├── models.js             # Data model definitions
├── tools/                # Tool wrapper classes
│   ├── crawlergo.js      # Crawlergo tool
│   ├── probe.js          # Probe tool
│   ├── clean.js          # Clean tool
│   └── getSample.js      # GetSample tool
└── storage/              # Storage adapters
    └── file.js           # File storage

Modification Guide

Adding a new tool

  1. Create a new tool class file in the src/tools/ directory
  2. Implement a static run() method
  3. Export the new tool class from src/index.js

Changing the data models

  1. Update the model definitions in src/models.js
  2. Update the TypeScript type definitions in src/index.d.ts
  3. Update the data-handling logic in the affected tool classes

Changing how tools are invoked

Each tool class is independent, so you can modify the corresponding src/tools/*.js file directly.