@talent-scout/data-collector

v0.1.1


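@talent-scout/data-collector builds the candidate pool. It collects signals from GitHub, community repositories, ranking pages, and the AI tooling ecosystem, and writes these raw leads to workspace-data/output/raw/<timestamp>/.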

Development prerequisites

  • Node.js 22+
  • pnpm 10+
  • gh installed and authenticated
  • Playwright dependencies that run correctly

Install dependencies from the repository root:

pnpm install

Common commands

pnpm --filter @talent-scout/data-collector run collect
pnpm --filter @talent-scout/data-collector run build

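Run these commands from the repository root. Internally, the package uses INIT_CWD and the shared workspace resolution logic to write its output to workspace-data/ in the repository root.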
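The path resolution described above could look roughly like the sketch below. It is only an illustration: `resolveRawOutputDir` is a hypothetical helper, the timestamp format is an assumption, and the package's real shared workspace resolution logic is not shown in this README. The only facts it relies on are that npm and pnpm set INIT_CWD to the directory where the command was invoked, and that raw output lands under workspace-data/output/raw/<timestamp>/.

```typescript
import * as path from "node:path";

// Hypothetical helper, not the package's actual API.
// INIT_CWD is set by npm/pnpm to the directory the command was started from,
// which stays at the repo root even when --filter runs the script inside a package.
export function resolveRawOutputDir(
  env: Record<string, string | undefined>,
  fallbackCwd: string,
  now: Date,
): string {
  const root = env.INIT_CWD ?? fallbackCwd;
  // Assumed timestamp format: ISO 8601 with ":" and "." replaced so it is filesystem-safe.
  const timestamp = now.toISOString().replace(/[:.]/g, "-");
  return path.join(root, "workspace-data", "output", "raw", timestamp);
}
```

Called as `resolveRawOutputDir(process.env, process.cwd(), new Date())`, this would fall back to the current directory whenever INIT_CWD is not set, for example when the script is run directly with node.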

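Collection scope

  • AI tool usage signals in code repositories, such as AGENTS.md, copilot-instructions.md, and .cursorrules
  • Traces of collaboration and AI usage in commit history
  • Repositories tagged with topics such as claude-code, mcp-server, and copilot
  • Contributor, stargazer, and fork networks of community repositories
  • Rankings and seed lists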
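As an illustration of the first item in the list above, the sketch below matches a repository's file paths against known AI tool configuration files. Only the three file names come from this README; the function name, the marker table, and the tool labels are hypothetical, and the real collector may detect these files differently (for example through GitHub code search).

```typescript
// Hypothetical marker table: file name -> tool label (labels are illustrative).
const AI_CONFIG_MARKERS = new Map<string, string>([
  ["AGENTS.md", "agents"],
  ["copilot-instructions.md", "copilot"],
  [".cursorrules", "cursor"],
]);

// Returns the sorted, de-duplicated tool labels whose config file appears in the listing.
export function detectAiConfigSignals(filePaths: string[]): string[] {
  const found = new Set<string>();
  for (const filePath of filePaths) {
    // Compare on the base name so nested paths such as .github/... still match.
    const baseName = filePath.split("/").pop() ?? "";
    const tool = AI_CONFIG_MARKERS.get(baseName);
    if (tool !== undefined) found.add(tool);
  }
  return Array.from(found).sort();
}
```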

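Key modules

  • src/index.ts: collection entry point; resumes existing results or starts a new run
  • src/github-signals.ts: leads from GitHub search
  • src/community.ts: community repository signals
  • src/rankings.ts: ranking scraping
  • src/stargazers.ts: stargazer collection for AI-tool-related repositories
  • src/follower-graph.ts: placeholder implementation of follower-graph expansion
  • src/query.ts: reads raw data already written to disk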

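Design principles

1. Collection must cover both "explicit signals" and "weak signals"

Looking only at stars on AI tool repositories makes it easy to lump "curious users" together with "genuinely high-quality developers". This package therefore collects both:

  • Explicit signals: configuration files, commit text, community contributions
  • Weak signals: stargazers, topic-related repositories, ranking exposure

How these signals are weighted is decided later, in the processing stage.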
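One way to picture this split is a raw signal record that carries its source but no score. The sketch below is an assumption-laden illustration: the type names, the source labels, and `groupByStrength` are all hypothetical and do not reflect the package's actual schema. The explicit/weak mapping mirrors the two bullet lists above.

```typescript
// Hypothetical types: names are illustrative, not the package's real schema.
type SignalStrength = "explicit" | "weak";

type SignalSource =
  | "config-file"
  | "commit-text"
  | "community-contribution"
  | "stargazer"
  | "topic-repo"
  | "ranking";

// Mapping taken from the explicit/weak bullet lists in the README.
const STRENGTH_BY_SOURCE: Record<SignalSource, SignalStrength> = {
  "config-file": "explicit",
  "commit-text": "explicit",
  "community-contribution": "explicit",
  "stargazer": "weak",
  "topic-repo": "weak",
  "ranking": "weak",
};

interface RawSignal {
  login: string; // GitHub username the signal points at
  source: SignalSource;
}

// Groups signals by strength without scoring them; weighting is left to a later stage.
export function groupByStrength(signals: RawSignal[]): Record<SignalStrength, RawSignal[]> {
  const grouped: Record<SignalStrength, RawSignal[]> = { explicit: [], weak: [] };
  for (const signal of signals) {
    grouped[STRENGTH_BY_SOURCE[signal.source]].push(signal);
  }
  return grouped;
}
```

Keeping scores out of the raw records means a change to the weighting scheme would not require collecting again.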

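2. Collection must be resumable

Every collector follows the pattern "check local output first, run only when it is missing". If ranking scraping or a GitHub search fails partway through, you do not need to rerun the whole collection from scratch.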
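A minimal sketch of that pattern, assuming each collector persists its result as a single JSON file. The name `collectOnce` and the one-file-per-collector layout are assumptions for illustration, not the package's actual implementation.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import * as path from "node:path";

// Hypothetical wrapper: reuse the collector's output if it is already on disk,
// otherwise run the collector and persist what it returns.
export async function collectOnce<T>(
  outputFile: string,
  collect: () => Promise<T>,
): Promise<T> {
  if (existsSync(outputFile)) {
    // A previous (possibly interrupted) run already finished this collector.
    return JSON.parse(readFileSync(outputFile, "utf8")) as T;
  }
  const result = await collect();
  mkdirSync(path.dirname(outputFile), { recursive: true });
  writeFileSync(outputFile, JSON.stringify(result, null, 2));
  return result;
}
```

Because the file is written only after `collect()` resolves, a collector that throws leaves no output behind and is simply retried on the next run, while collectors that already succeeded are skipped.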

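3. Tool ecosystems are treated equally

The default configuration treats Claude, Copilot, Cursor, Cline, Windsurf, and similar tools as signal sources of equal standing rather than favoring any single ecosystem. This keeps the candidate pool from being skewed by the habits of one community.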

Implementation flow

flowchart LR
  A[talents.yaml] --> B[GitHub Signals]
  A --> C[Community Signals]
  A --> D[Ranking Sources]
  A --> E[Stargazers]
  B --> F[raw/<timestamp>]
  C --> F
  D --> F
  E --> F

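Current limitations

  • The GitHub Search API is still subject to its pagination cap
  • Ranking scraping depends on page structure; the parsers need fixing when a source site is redesigned
  • follower-graph.ts is not yet a complete production implementation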
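To make the first limitation concrete: the GitHub Search API returns at most 1,000 results per query, at up to 100 results per page. The sketch below only computes how many pages are reachable and whether a query would be truncated. `planSearchPages` is a hypothetical helper, and this README does not say how the collector itself handles truncated queries.

```typescript
// Limits of the GitHub REST Search API.
const SEARCH_RESULT_CAP = 1000;
const MAX_PER_PAGE = 100;

interface SearchPlan {
  pages: number; // number of pages that can actually be fetched
  perPage: number; // page size after clamping to the API maximum
  truncated: boolean; // true when the query matches more results than the cap allows
}

// Hypothetical helper: given a query's total_count, plan the reachable pages.
export function planSearchPages(totalCount: number, perPage: number = MAX_PER_PAGE): SearchPlan {
  const size = Math.min(Math.max(1, Math.floor(perPage)), MAX_PER_PAGE);
  const reachable = Math.min(Math.max(0, totalCount), SEARCH_RESULT_CAP);
  return {
    pages: Math.ceil(reachable / size),
    perPage: size,
    truncated: totalCount > SEARCH_RESULT_CAP,
  };
}
```

A truncated plan is a hint that the query should be narrowed, for instance with additional search qualifiers, so that each sub-query stays under the cap.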

Related documentation