@yya007/skill-finder

v0.1.1

Published

9 days ago

Universal agent skill discovery: search 37,500+ curated agent skills from all major registries using natural language. Runs entirely locally via FAISS. No API calls, no latency.

yya007

ai agent skills skill-finder skill-discovery claude claude-code codex openclaw faiss vector-search semantic-search llm-tools

SkillFinder

Find the right agent skill in seconds — local search, no API needed.

SkillFinder searches 37,500+ curated agent skills from all major registries using natural language. Everything runs locally: no API calls, no latency, no cost per query. Works with any agent that supports SKILL.md — Claude Code, OpenClaw, Codex, and more.

You: /skill-finder deploy kubernetes clusters with rollback

Agent: Found 2 skills for "deploying kubernetes clusters":

1. **k8s-deployer** ⭐ 142 stars — `skillsmp`
   Deploy and manage Kubernetes clusters with automated rollbacks and blue-green deployments.
   Install: `/plugin install k8s-deployer`
   If this command fails, visit the Skill link and follow the repository's own install instructions.
   Skill: https://github.com/user/k8s-deployer/blob/main/SKILL.md

2. **helm-chart-manager** ⭐ 89 stars — `skillsmp`
   Manage Helm chart lifecycle: install, upgrade, diff, and rollback.
   Install: `/plugin install helm-chart-manager`
   If this command fails, visit the Skill link and follow the repository's own install instructions.
   Skill: https://github.com/user/helm-chart-manager/blob/main/SKILL.md

Want me to fetch the full SKILL.md for any of these before you install?

Or just describe what you need — the skill also triggers automatically from natural language: "find a skill for deploying kubernetes clusters", "is there a skill for SQL migrations", etc.

Why not just Google it?

The Agent Skills open standard is supported by Claude Code, Cursor, VS Code Copilot, GitHub Copilot, OpenAI Codex, Gemini CLI, Goose, Roo Code, and other tools. Thousands of SKILL.md files exist across GitHub — with no unified way to find them.

Searching the web manually:

$ # Google: "kubernetes deploy claude code skill"
→ 2,840,000 results — blog posts, Stack Overflow, unrelated GitHub repos
→ No quality signals: is this repo maintained? 5 stars or 5,000?
→ No install commands visible in results
→ GitHub code search requires login; finds files, not skills as units
→ May take 20–30 minutes to find 3 relevant options — if they exist at all

SkillFinder:

You: /skill-finder deploy kubernetes clusters with rollback

Agent: Found 3 skills for "deploying kubernetes clusters":

1. k8s-deployer  ⭐ 142
   Deploy and manage Kubernetes clusters with rollbacks and blue-green deploys.
   Install: /plugin install k8s-deployer

2. helm-chart-manager  ⭐ 89
   Manage Helm chart lifecycle: install, upgrade, diff, and rollback.
   Install: /plugin install helm-chart-manager

3. terraform-k8s  ⭐ 61
   Provision Kubernetes infrastructure on AWS/GCP/Azure via Terraform.
   Install: /plugin install terraform-k8s

Or query the index directly from the CLI:

$ python scripts/search.py "deploy kubernetes clusters" --no-json --propose 5

Results in < 200 ms, ranked by semantic relevance and community trust, install commands included.

For end users — install and search

Prerequisites

Python 3.10+
Ollama installed locally

Install

Option 1 — npx (easiest, auto-detects your agent):

npx skills add yya007/SkillFinder

Uses the skills CLI — detects which agents you have installed and copies files to the right directory automatically.

Option 2 — npm:

npm install -g @yya007/skill-finder

Then copy to your agent's skills directory:

| Agent | Skills directory |
|-------|-----------------|
| Claude Code | ~/.claude/skills/skill-finder |
| OpenClaw | ~/.openclaw/skills/skill-finder |
| Codex | ~/.codex/skills/skill-finder |

# Copy once (pick your agent's dir from the table above):
cp -r "$(npm root -g)/@yya007/skill-finder" ~/.claude/skills/skill-finder

Finish setup (all platforms):

cd ~/.claude/skills/skill-finder   # or your platform's skills dir
pip install -r scripts/requirements.txt
ollama pull qwen3-embedding:0.6b

That's it. Ask your agent "find a skill for X", or invoke the skill explicitly with /skill-finder.

OpenClaw: ClawHub listing coming soon. Until then, install via npm or git clone https://github.com/yya007/SkillFinder ~/.openclaw/skills/skill-finder.

Alternatively, clone the repository straight into your agent's skills directory:

# Claude Code
git clone https://github.com/yya007/SkillFinder ~/.claude/skills/skill-finder

# Codex
git clone https://github.com/yya007/SkillFinder ~/.codex/skills/skill-finder

# OpenClaw
git clone https://github.com/yya007/SkillFinder ~/.openclaw/skills/skill-finder

Then run the finish-setup block above.

Usage — natural language (recommended)

When using an agent (Claude Code, Codex, OpenClaw, etc.), just describe what you need:

"find a skill for deploying kubernetes clusters"
"is there a skill that writes and runs SQL migrations"
"what skills are available for web scraping"
"compare skills for Terraform infrastructure"

Usage — CLI

CLI vs agent: scripts/search.py returns a raw vector-similarity ranking. The agent layer (query expansion, tiered fallback, reranking by intent) is not replicated here. Use the CLI for scripting or development; use the agent integration for discovery in normal use.

cd ~/.claude/skills/skill-finder

# Search all skills (all platforms, recommended)
python scripts/search.py "deploy kubernetes clusters" --no-json

# Optionally filter to a specific platform
python scripts/search.py "deploy kubernetes clusters" --platform claude_code

# Only return skills that passed the ClawHub safety scan
python scripts/search.py "web scraping" --safety_only

# Filter by minimum star count
python scripts/search.py "ci/cd pipeline" --min_stars 50

# Human-readable output instead of JSON
python scripts/search.py "pptx presentation" --no-json --propose 5

Developer tools (fetch_skill.py, update_index.py) are not included in the npm package — they are available in the git repo for contributors and operators rebuilding the index.

The --platform flag is optional and accepts: claude_code, openclaw, codex.
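
For scripting, the flags above can be assembled programmatically. The sketch below is a hypothetical helper (`build_search_command` is not part of SkillFinder) that builds the argument list for one `scripts/search.py` call using only the flags documented in this section. It assumes the flags can be combined freely, which the examples above do not confirm.

```python
from typing import List, Optional

# Platform values accepted by --platform, as listed in this README.
VALID_PLATFORMS = {"claude_code", "openclaw", "codex"}


def build_search_command(
    query: str,
    platform: Optional[str] = None,
    min_stars: Optional[int] = None,
    safety_only: bool = False,
    human_readable: bool = False,
    propose: Optional[int] = None,
) -> List[str]:
    """Return the argv list for one scripts/search.py invocation (hypothetical helper)."""
    if platform is not None and platform not in VALID_PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    cmd = ["python", "scripts/search.py", query]
    if platform is not None:
        cmd += ["--platform", platform]
    if min_stars is not None:
        cmd += ["--min_stars", str(min_stars)]
    if safety_only:
        cmd.append("--safety_only")
    if human_readable:
        cmd.append("--no-json")  # human-readable output instead of JSON
    if propose is not None:
        cmd += ["--propose", str(propose)]
    return cmd


print(build_search_command("ci/cd pipeline", min_stars=50))
```

The resulting list can be passed to `subprocess.run(cmd, cwd=skill_dir, capture_output=True, text=True)`, where `skill_dir` is your agent's skills directory from the install section.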

What gets indexed (and what gets filtered out)

SkillFinder indexes skills from five sources: SkillsMP (GitHub code search), ClawHub/OpenClaw (awesome list + org/topic search), SkillHub (web scrape), the Anthropic official marketplace, and GitHub topic tags (claude-skill, codex-skill, agent-skill, etc.).

A skill is kept if it passes both of the following:

Has a non-empty description (from SKILL.md frontmatter or README)
≥ 10 GitHub stars

A skill is dropped if either condition is not met. Registry membership and SkillHub ratings do not override the star threshold.
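
The two-condition rule can be expressed as a small predicate. This is a minimal sketch, not the project's actual filter: the field names `description` and `stars` are assumptions about the record layout.

```python
from typing import Any, Mapping

MIN_STARS = 10  # star threshold stated in this README


def passes_quality_filter(skill: Mapping[str, Any]) -> bool:
    """Keep a skill only if it has a non-empty description AND at least MIN_STARS stars.

    The keys "description" and "stars" are hypothetical field names.
    """
    description = (skill.get("description") or "").strip()
    stars = skill.get("stars") or 0
    return bool(description) and stars >= MIN_STARS


print(passes_quality_filter({"description": "Deploy k8s clusters", "stars": 142}))
```

Both checks are combined with `and`, so a well-described skill with 9 stars is dropped just like a popular skill with an empty description.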

Safety: ClawHub records carry a safety_scan result from VirusTotal. Other sources do not. Always review a skill's repository before installing it.

For developers — build the index from scratch

Use this if you want to run the full crawl-embed-index pipeline locally, add a new registry, or contribute to the project.

Additional prerequisites

A GitHub personal access token with public_repo read scope (for crawlers)
Ollama with qwen3-embedding:0.6b (same model used at runtime)

Setup

git clone https://github.com/yya007/SkillFinder
cd SkillFinder
pip install -r requirements-dev.txt
export GITHUB_TOKEN=ghp_your_token_here

Step 1 — Crawl registries

Each crawler writes a raw JSONL file to data/raw/. Run them independently; they handle rate limits automatically.

# SkillsMP (GitHub code search for SKILL.md files)
python -m crawlers.skillsmp_crawler -o data/raw/skillsmp.jsonl

# ClawHub / OpenClaw (awesome list + org/topic discovery, requires GITHUB_TOKEN)
python -m crawlers.clawhub_crawler -o data/raw/clawhub.jsonl

# GitHub topic search (claude-skill, codex-skill, agent-skill, etc.)
python -m crawlers.topic_crawler -o data/raw/topic.jsonl --data-dir data/raw

# SkillHub
python -m crawlers.skillhub_crawler -o data/raw/skillhub.jsonl

# Anthropic official marketplace
python -m crawlers.marketplace_crawler -o data/raw/marketplace.jsonl

Each crawler accepts --limit N (cap records for testing) and --log-level DEBUG.

Step 2 — Normalize and deduplicate

python pipeline/normalize.py -o data/unified_skills.jsonl

Merges all raw sources, deduplicates by canonical repo URL, applies quality filters, and builds embedding text.
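
To illustrate the deduplication step, here is a sketch under stated assumptions. It treats the first two path segments of a repository URL as the canonical key, keeps the first record seen for each key, and reads the URL from a hypothetical `repo_url` field. The real `normalize.py` may canonicalize and merge differently.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlparse


def canonical_repo_url(url: str) -> str:
    """Reduce a full repository URL to scheme://host/owner/repo, lowercased.

    Assumes an absolute URL such as https://github.com/owner/repo/blob/main/SKILL.md.
    """
    parsed = urlparse(url.strip())
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return url.strip().lower()
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"https://{parsed.netloc}/{owner}/{repo}".lower()


def dedupe(records: Iterable[dict]) -> List[dict]:
    """Keep the first record seen for each canonical repo URL."""
    seen: Dict[str, dict] = {}
    for record in records:
        seen.setdefault(canonical_repo_url(record["repo_url"]), record)
    return list(seen.values())


records = [
    {"repo_url": "https://github.com/User/k8s-deployer.git", "source": "skillsmp"},
    {"repo_url": "https://github.com/user/k8s-deployer/blob/main/SKILL.md", "source": "topic"},
    {"repo_url": "https://github.com/user/helm-chart-manager", "source": "skillsmp"},
]
print(len(dedupe(records)))
```

Overlap of this kind between registries would explain why the per-registry counts in the Coverage table add up to more than the deduplicated total.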

Step 3 — Embed

python pipeline/embed.py \
  --cache-embeddings data/embeddings.npy \
  --cache-ordered data/unified_skills_ordered.jsonl \
  --progress-file data/embed_progress.jsonl

Calls local Ollama (qwen3-embedding:0.6b) to embed all skills. Writes data/embeddings.npy.

--cache-embeddings / --cache-ordered: reuse vectors from a previous run — only new or changed skills are sent to Ollama. Omit on a fresh build.
--progress-file: write a per-batch recovery file so a crash mid-run can be resumed by re-running the same command. Deleted automatically on success.
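
The caching behaviour can be pictured as a lookup before each embedding call. The sketch below assumes the cache is keyed on the exact embedding text, so any edit to a skill produces a cache miss. How `embed.py` actually matches cached vectors to skills is not documented here, and `split_cached` is a hypothetical name.

```python
from typing import Dict, List, Sequence, Tuple


def split_cached(
    texts: Sequence[str], cache: Dict[str, List[float]]
) -> Tuple[Dict[int, List[float]], List[int]]:
    """Return (vectors reused from cache by position, positions that still need embedding)."""
    reused: Dict[int, List[float]] = {}
    pending: List[int] = []
    for i, text in enumerate(texts):
        if text in cache:
            reused[i] = cache[text]
        else:
            pending.append(i)  # new or changed skill: must be sent to the embedder
    return reused, pending


cache = {"Deploy Kubernetes clusters": [0.1, 0.2]}
texts = ["Deploy Kubernetes clusters", "Manage Helm chart lifecycle"]
reused, pending = split_cached(texts, cache)
print(sorted(reused), pending)
```

Only the positions in `pending` would be sent to Ollama. On a fresh build the cache is empty and every skill is pending.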

Step 4 — Build FAISS index

python pipeline/build_index.py

Produces data/index.faiss and data/metadata.jsonl. These are the runtime index files committed to the repo and also published as a GitHub Release artifact for weekly updates.

Run tests

# Unit + integration tests (no Ollama or network required)
pytest tests/ -v

# Full quality benchmark (requires data/index.faiss and Ollama running)
pytest tests/quality/ -v -m quality

Contributing

Fork the repo and create a feature branch.
Add or update tests for any changed behaviour.
Run pytest tests/ -v — all tests must pass.
Open a pull request with a clear description of the change.

Coverage

| Registry | Crawler | Skills in index |
|----------|---------|----------------:|
| SkillsMP (GitHub code search) | skillsmp_crawler.py | 387 |
| ClawHub / OpenClaw | clawhub_crawler.py | 4,610 |
| SkillHub | skillhub_crawler.py | 7,615 |
| Anthropic official marketplace | marketplace_crawler.py | 28,240 |
| GitHub topics | topic_crawler.py | 15,351 |
| Total (after dedup) | | 37,962 |

Star Distribution

| Stars | Skills | Distribution |
|-------|-------:|:-------------|
| 10–49 | 4,135 | ██░░░░░░░░░░░░░░░░░░ 11% |
| 50–99 | 1,847 | █░░░░░░░░░░░░░░░░░░░ 5% |
| 100–499 | 16,231 | █████████░░░░░░░░░░░ 43% |
| 500–999 | 1,144 | █░░░░░░░░░░░░░░░░░░░ 3% |
| 1k–5k | 9,169 | █████░░░░░░░░░░░░░░░ 24% |
| 5k+ | 5,436 | ███░░░░░░░░░░░░░░░░░ 14% |
| Total | 37,962 | |

Not shown: skills with 0–9 stars are dropped from the index regardless of which registry they come from. A non-empty description is also required.
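
The percentages and bars in the table follow from the bucket counts. The sketch below recomputes them. The 20-cell bar width and the round-to-nearest rule are assumptions that happen to reproduce the figures shown, not a description of how the table was generated.

```python
# Counts copied from the Star Distribution table.
BUCKETS = {
    "10–49": 4135,
    "50–99": 1847,
    "100–499": 16231,
    "500–999": 1144,
    "1k–5k": 9169,
    "5k+": 5436,
}


def bar(count: int, total: int, width: int = 20) -> str:
    """Render a fixed-width bar whose filled part is proportional to count/total."""
    filled = round(width * count / total)
    return "█" * filled + "░" * (width - filled)


TOTAL = sum(BUCKETS.values())
for label, count in BUCKETS.items():
    print(f"{label:>8}  {bar(count, TOTAL)} {round(100 * count / TOTAL)}%")
```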

How it works

Weekly CI pipeline (GitHub Actions): crawls all registries → deduplicates → embeds with Qwen3-Embedding-0.6B via Ollama → builds FAISS index → commits the updated index to the repo as a weekly data release.
Runtime (your machine): query is embedded locally via Ollama → FAISS nearest-neighbor search (< 200 ms on CPU) → candidate pool returned to the agent → agent reranks and presents the best matches.
Deep dive: the agent can fetch the raw SKILL.md from any result's repo before you install it.

The same embedding model (qwen3-embedding:0.6b) is used in CI and at runtime — the index is always compatible.
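
The runtime lookup is a nearest-neighbor search over embedding vectors. As a dependency-free illustration, the sketch below does a brute-force cosine-similarity scan in plain Python in place of FAISS. The similarity metric, the toy two-dimensional vectors, and the function names are all assumptions; the real index holds Qwen3 embeddings and may use a different distance.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(
    query_vec: Sequence[float],
    index_vecs: Sequence[Sequence[float]],
    names: Sequence[str],
    k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the k (score, name) pairs most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), name) for vec, name in zip(index_vecs, names)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data standing in for real embeddings and metadata.
index_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
names = ["k8s-deployer", "sql-migrator", "helm-chart-manager"]
print([name for _, name in top_k([1.0, 0.0], index_vecs, names, k=2)])
```

Because the query and the index vectors must live in the same embedding space, the results are only meaningful when both come from the same model, which is why CI and runtime share qwen3-embedding:0.6b.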

See docs/architecture.md for the full technical design.

Requirements

| Requirement | Version |
|-------------|---------|
| Python | 3.10+ |
| numpy | ≥ 1.26 |
| faiss-cpu | ≥ 1.8 |
| requests | ≥ 2.31 |
| pyyaml | ≥ 6.0 |
| Ollama | latest |
| qwen3-embedding:0.6b | via Ollama |

Acknowledgements

SkillFinder would not be possible without the registries and communities that host and curate agent skills:

| Source | What it provides |
|--------|-----------------|
| SkillsMP / GitHub | Open-indexed SKILL.md files discovered via GitHub code search and topic tags |
| ClawHub / OpenClaw | Community-curated awesome list of OpenClaw skills with safety scan metadata |
| SkillHub | Ranked and editorially reviewed skill registry |
| Anthropic marketplace | Official Claude skills maintained by Anthropic |

Thank you to every skill author who publishes their work openly.

Star this repo

If SkillFinder saves you time, please star this repo — it helps others discover the project and motivates continued development.

Also consider starring the skills you find useful — it's the best way to support their authors.