@phoenixaihub/vuln-harvest
v0.1.0
AI-guided vulnerability discovery framework. Agentic harness: hypothesis → PoC → verify → triage.
VulnHarvest — AI-Guided Vulnerability Discovery Framework
Problem
Security teams lack open-source tooling for using LLMs to discover vulnerabilities in codebases systematically. Mozilla showed internally that AI can find 423 Firefox bugs in one month using Claude Mythos, including 15-year-old use-after-free (UAF) bugs, sandbox escapes, and race conditions, but it has not open-sourced the pipeline. Meanwhile, AI-assisted vulnerability scanning is making offense cheaper for attackers, and defenders need the same tooling.
Solution
Open-source agentic harness for AI-guided vulnerability discovery:
- Hypothesis Generation — LLM analyzes code patterns, generates vulnerability hypotheses
- PoC Creation — Automated proof-of-concept test generation per hypothesis
- Verification — Execute PoCs in sandboxed environment, confirm exploitability
- Deduplication — Match against known CVEs, filter false positives
- Triage — Severity classification (CVSS-like scoring), report generation
Project-agnostic: bring your own codebase, your own model, your own CI.
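The five stages above can be read as a single loop over hypotheses. The sketch below is one possible shape for that loop, not VulnHarvest's actual implementation: every type and function name (`Hypothesis`, `Stages`, `runPipeline`, and so on) is hypothetical, and each stage is injected so the model, sandbox, and scorer stay swappable.

```typescript
// Hypothetical types: none of these names come from the VulnHarvest codebase.
type VulnClass = "xss" | "sqli" | "path-traversal";

interface Hypothesis {
  id: string;
  file: string;
  vulnClass: VulnClass;
  rationale: string;
}

interface PoC {
  hypothesis: Hypothesis;
  testSource: string;
}

interface Finding {
  hypothesis: Hypothesis;
  score: number; // CVSS-like score on a 0-10 scale
}

// Each stage is injected, so the LLM backend, sandbox, and scorer are swappable.
interface Stages {
  generateHypotheses(code: Record<string, string>): Promise<Hypothesis[]>;
  createPoC(h: Hypothesis): Promise<PoC>;
  verify(poc: PoC): Promise<boolean>; // true when the PoC reproduces in the sandbox
  isDuplicate(h: Hypothesis): Promise<boolean>; // e.g. matches a known CVE
  score(h: Hypothesis): number;
}

async function runPipeline(
  code: Record<string, string>,
  stages: Stages
): Promise<Finding[]> {
  const findings: Finding[] = [];
  for (const h of await stages.generateHypotheses(code)) {
    const poc = await stages.createPoC(h);
    if (!(await stages.verify(poc))) continue; // unconfirmed: treated as a false positive
    if (await stages.isDuplicate(h)) continue; // already known, so not reported again
    findings.push({ hypothesis: h, score: stages.score(h) });
  }
  // Highest score first, so triage starts with the most severe findings.
  return findings.sort((a, b) => b.score - a.score);
}

// Stubbed stages for a dry run; a real setup would call an LLM and a sandbox.
const stubStages: Stages = {
  generateHypotheses: async (code) =>
    Object.keys(code).map(
      (file, i): Hypothesis => ({
        id: `H${i + 1}`,
        file,
        vulnClass: "sqli",
        rationale: "query built by string concatenation",
      })
    ),
  createPoC: async (h) => ({ hypothesis: h, testSource: `// PoC for ${h.id}` }),
  verify: async (poc) => poc.hypothesis.file.endsWith("db.ts"),
  isDuplicate: async () => false,
  score: () => 8.1,
};
```

With these stubs, `runPipeline({ "src/db.ts": "...", "src/ui.ts": "..." }, stubStages)` resolves to a single finding for `src/db.ts`, because the stub verifier rejects the hypothesis for `src/ui.ts`.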
Market
- TAM: $15B+ application security testing market (growing 20%+ YoY)
- Adjacent validated: Snyk ($8.5B valuation), Semgrep (OSS + commercial), CodeQL (GitHub/Microsoft)
- Gap: None of these use LLMs for hypothesis-driven discovery; they rely on pattern matching or static analysis. VulnHarvest adds the hypothesis → PoC → verify loop on top.
- Mozilla proof point: 423 bugs in 1 month including bugs that survived decades of fuzzing — validates the approach at scale
Architecture
┌─────────────┐    ┌──────────────┐    ┌─────────────┐
│ Code Ingest │───▶│  Hypothesis  │───▶│   PoC Gen   │
│  (AST/CFG)  │    │  Generator   │    │   Engine    │
└─────────────┘    └──────────────┘    └─────────────┘
                                              │
┌─────────────┐    ┌──────────────┐    ┌──────▼──────┐
│  Reporter   │◀───│   Triage &   │◀───│   Sandbox   │
│   (SARIF)   │    │    Dedup     │    │  Executor   │
└─────────────┘    └──────────────┘    └─────────────┘

- TypeScript/Node.js CLI
- SARIF output for CI/CD integration
- Pluggable LLM backend (OpenAI, Anthropic, local models)
- Sandboxed execution (Docker-based)
- CVE database integration
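To make the SARIF bullet concrete, here is a minimal sketch of how a triaged finding might be turned into a SARIF 2.1.0 log. The `TriagedFinding` shape, the `VH-SQLI` rule id, and the mapping from score to SARIF `level` are assumptions for illustration; the score bands follow the CVSS v3 qualitative ratings (low below 4.0, medium below 7.0, high below 9.0, critical from 9.0).

```typescript
// Hypothetical finding shape; field names are illustrative, not VulnHarvest's.
interface TriagedFinding {
  ruleId: string;
  message: string;
  file: string;
  line: number;
  score: number; // CVSS-like score on a 0-10 scale
}

type Severity = "none" | "low" | "medium" | "high" | "critical";
type SarifLevel = "none" | "note" | "warning" | "error";

// Bands follow the CVSS v3 qualitative severity rating scale.
function severityLabel(score: number): Severity {
  if (score <= 0) return "none";
  if (score < 4.0) return "low";
  if (score < 7.0) return "medium";
  if (score < 9.0) return "high";
  return "critical";
}

// Assumed mapping from severity to the four SARIF result levels.
function toSarifLevel(score: number): SarifLevel {
  const label = severityLabel(score);
  if (label === "critical" || label === "high") return "error";
  if (label === "medium") return "warning";
  return label === "low" ? "note" : "none";
}

// Builds a minimal SARIF 2.1.0 log with one run and one result per finding.
function toSarif(findings: TriagedFinding[]) {
  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0",
    runs: [
      {
        tool: { driver: { name: "VulnHarvest", version: "0.1.0" } },
        results: findings.map((f) => ({
          ruleId: f.ruleId,
          level: toSarifLevel(f.score),
          message: { text: f.message },
          locations: [
            {
              physicalLocation: {
                artifactLocation: { uri: f.file },
                region: { startLine: f.line },
              },
            },
          ],
        })),
      },
    ],
  };
}

const sarifLog = toSarif([
  {
    ruleId: "VH-SQLI", // hypothetical rule id
    message: "SQL query built by string concatenation",
    file: "src/db.ts",
    line: 42,
    score: 8.1,
  },
]);
```

Serializing `sarifLog` with `JSON.stringify(sarifLog, null, 2)` gives a file that CI systems accepting SARIF uploads can consume. A fuller log would also list rule metadata under `tool.driver.rules`.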
Competitive Landscape
| Tool | Approach | LLM-Guided? | Open Source? |
|------|----------|-------------|--------------|
| Semgrep | Pattern matching | No | Yes |
| CodeQL | Dataflow analysis | No | Partial |
| Snyk Code | ML pattern detection | Partial | No |
| Mozilla/Mythos | LLM hypothesis | Yes | No (internal) |
| VulnHarvest | LLM hypothesis + PoC | Yes | Yes |
Verdict: BUILD
Rationale:
- Technical moat: Agentic harness with hypothesis→PoC→verify loop is novel in open source
- Market timing: Mozilla just proved the approach works; no open-source equivalent exists
- Brand fit: Extends phoenix-assistant security cluster (mcp-security-scanner, agent-security-scanner, etc.)
- Feasibility: MVP scope is achievable — hypothesis gen + PoC for common vulnerability classes (XSS, SQLi, path traversal, buffer overflow patterns)
- Converging signals: 3+ independent sources (Mozilla/Mythos 357 HN pts, jefftk.com 367 HN pts, Karpathy supply chain alert, Xeiaso 831 HN pts)
MVP Scope
- Code ingestion (AST parsing for JS/TS/Python/C)
- Hypothesis generation via LLM (configurable model)
- PoC test generation for top 5 vulnerability classes
- Basic sandbox execution
- SARIF output
- CLI interface:
vulnharvest scan ./src --model claude-sonnet
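
As a rough illustration of how the command above could be parsed, the sketch below handles `scan <path> [--model <name>]`. Only the `scan` subcommand and the `--model` flag come from the example; the `ScanOptions` type, the `parseScanArgs` helper, and the fallback model name are assumptions, and the real CLI may use an argument-parsing library instead.

```typescript
// Hypothetical option shape for the scan command.
interface ScanOptions {
  command: "scan";
  target: string;
  model: string;
}

// Parses arguments of the form: scan <path> [--model <name>]
// The default model name is a placeholder assumption, not a documented default.
function parseScanArgs(argv: string[], defaultModel = "claude-sonnet"): ScanOptions {
  const [command, ...rest] = argv;
  if (command !== "scan") {
    throw new Error(`unknown command: ${command ?? "(none)"}`);
  }

  let target: string | undefined;
  let model = defaultModel;

  for (let i = 0; i < rest.length; i++) {
    const arg = rest[i];
    if (arg === "--model") {
      const value = rest[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error("--model requires a value");
      }
      model = value;
      i++; // skip the value that was just consumed
    } else if (arg.startsWith("--")) {
      throw new Error(`unknown flag: ${arg}`);
    } else if (target === undefined) {
      target = arg; // first positional argument is the scan target
    } else {
      throw new Error(`unexpected argument: ${arg}`);
    }
  }

  if (target === undefined) {
    throw new Error("scan requires a target path");
  }
  return { command: "scan", target, model };
}

const scanOptions = parseScanArgs(["scan", "./src", "--model", "claude-sonnet"]);
```

In a Node.js entry point, the argument array would typically be `process.argv.slice(2)`.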
