ideabox

v1.0.2

Published

10 days ago

Data-driven project idea engine for coding agents. Research, brainstorm, plan, and build — all from one command.

0High
0Medium
0Low

pawanpaudel93

ideabox ideas project-ideas claude-code codex-cli

IdeaBox

Data-driven project idea engine for coding agents. Researches real market demand, scores ideas on monetization and open-source impact, and orchestrates a full 9-phase pipeline from idea to shipped code. Self-improving — gets smarter with every session.

The Problem

You have Claude Code tokens but no idea what to build. Existing idea generators are either:

Static lists frozen in time (91K+ stars on app-ideas, never updated)
Thin AI wrappers that hallucinate untested concepts
Validation-only tools that require you already have an idea

No tool combines real market demand data with developer skill matching in a developer-native workflow.

The Solution

IdeaBox researches real demand signals from 6 source categories — HN, GitHub, Reddit, npm, the MCP ecosystem, and more. Every idea is backed by evidence, scored on a 60-point rubric, and matched to your skills. Then it orchestrates the entire build pipeline — brainstorm, plan, build, test, polish, and ship.

Features

| Feature | Description | |---------|-------------| | Data-grounded research | Parallel subagents search 6 source categories using structured APIs | | 60-point scoring rubric | Revenue potential, market gap, demand signal, feasibility, stack fit, trend momentum | | 9-phase pipeline | Research through shipped code, all in one command | | Visual score bars | Quick-scan summary table + per-idea score breakdowns | | Deep-interview validation | Ambiguity scoring across Problem/User/Scope/Revenue dimensions | | Head-to-head comparison | /ideabox backlog compare 1 3 for side-by-side analysis | | Acceptance criteria | Build phase only completes when ALL spec criteria pass | | Handoff documents | Structured transition docs between phases | | Self-improving | Learns preferences, adapts source weights, evolves queries | | Phase skipping | Detects existing artifacts and offers to skip completed phases | | Research history | Every run auto-saved for trend detection over time | | Resume support | Save and resume pipeline state across sessions | | Self-contained | Zero external plugin dependencies | | Cross-agent | Works with Claude Code and Codex CLI |

Installation

Claude Code Plugin (recommended)

claude plugin add pawanpaudel93/ideabox

Via npm (recommended for CLI install)

npx ideabox init

This copies skills into your project's .claude/skills/ directory with all phase files and references bundled into each skill.

From Source

git clone https://github.com/pawanpaudel93/ideabox.git
cd ideabox
pnpm install
node bin/cli.mjs init

Requirements

Claude Code or Codex CLI
Node.js >= 18
gh CLI (recommended — higher GitHub API rate limits)

Permissions

IdeaBox uses these tools during its pipeline:

| Tool | Phase | Purpose | |------|-------|---------| | WebSearch | Research | Search HN, GitHub, Reddit for demand signals | | WebFetch | Research | Hit structured APIs (HN Algolia, npm, Reddit JSON) | | Agent | Research, Build | Parallel subagents for research and implementation | | Read | All | Read files, state, research artifacts | | Write | All | Write phase outputs, state, profiles | | Edit | Build, Polish | Modify source code | | Bash | Build, QA, Ship | Run tests, git commands, CLI tools | | Glob / Grep | Build, QA | Search codebase |

Claude Code

Auto mode (recommended) — press Shift+Tab to toggle. Approves most tool calls with background safety checks. Requires Team/Enterprise/API plan with Sonnet 4.6+ or Opus 4.6+.

Accept edits mode — press Shift+Tab to select acceptEdits. Auto-approves file changes, prompts for Bash and network.

Custom permission rules — add to .claude/settings.json (project) or ~/.claude/settings.json (global):

{
  "permissions": {
    "allow": [
      "WebSearch",
      "WebFetch",
      "Glob",
      "Grep",
      "Read",
      "Edit",
      "Write",
      "Bash(git *)",
      "Bash(gh *)",
      "Bash(npm *)",
      "Bash(pnpm *)",
      "Bash(node *)"
    ]
  }
}

Note: deny rules take precedence over allow. Protected paths (.git/, .env) always prompt regardless of mode.

Codex CLI

# Recommended: workspace-write sandbox with auto-approvals
codex --full-auto "run ideabox research"

# More control: approval mode + sandbox separately
codex -a untrusted -s workspace-write "run ideabox"

# Maximum access (use with caution)
codex --yolo "run ideabox"

| Codex Flag | Claude Code Equivalent | |-----------|----------------------| | -a on-request | default mode | | -a untrusted | acceptEdits mode | | --full-auto | auto mode (sandboxed) | | --yolo | bypassPermissions mode |

Usage

/ideabox                        # Full 9-phase pipeline: research -> ship
/ideabox research               # Browse ideas without building
/ideabox backlog                # View saved ideas
/ideabox backlog compare 1 3    # Head-to-head idea comparison
/ideabox profile                # Set up, edit, or reset learning

First Run

On your first /ideabox run, you'll be asked 4 quick setup questions:

GitHub username — scans your repos to understand your tech stack
Interests — developer tools, SaaS, mobile, AI/ML, open source, desktop
Goals — monetization, open-source impact, or both
Avoid topics — optional exclusions (crypto, social media, etc.)

After setup, it goes straight to researching ideas.

Example Output

## Research Results: 5 ideas scored

| # | Idea                        | Score  | Rating      | Monetization     | Top Signal          |
|----|----------------------------|--------|-------------|------------------|---------------------|
| 1  | MCP Testing Framework      | 48/60  | Exceptional | Freemium $29/mo  | 47 upvotes Reddit   |
| 2  | AI Cost Monitor            | 42/60  | Strong      | SaaS $19/mo      | 3 HN threads        |
| 3  | Context Engine CLI         | 38/60  | Strong      | Open source      | 12 GitHub issues    |

---

### #1: MCP Testing Framework

> Developers cannot test MCP servers before production deployment.

Revenue    [=========-] 9/10
Gap        [========--] 8/10
Demand     [=========-] 9/10
Feasibility[========--] 8/10
Stack Fit  [======----] 6/10
Trend      [========--] 8/10

Evidence:
  [Reddit]  "How do you test MCP servers?" (47 upvotes)
  [HN]      "Show HN: MCP server testing" (89 points)
  [GitHub]  12 open issues requesting testing tools

How It Works

9-Phase Pipeline

/ideabox
  [1. Research]     Parallel subagents search 6 source categories
  [2. Brainstorm]   Refine chosen idea into a design spec
  [3. Plan]         Create bite-sized TDD implementation plan
  [4. Build]        Execute with subagent-driven development
  [5. QA]           Systematic testing with health score rubric
  [6. Polish]       Visual QA, AI slop detection, accessibility
  [7. Ship]         PR creation, version bump, changelog
  [8. Post-Ship]    Production monitoring, documentation sync
  [9. Learn]        Update preferences, improve future suggestions

Phases 1-4 are mandatory (research, brainstorm, plan, build)
Phases 5-8 can be skipped by user request
Phase 9 always runs automatically

Phase Gates

Each phase has a concrete gate condition:

| Phase | Gate Condition | |-------|---------------| | Research | 3+ ideas with 3+ evidence sources each | | Brainstorm | Spec written, self-reviewed, user-approved | | Plan | No placeholders, full spec coverage, type consistency | | Build | All tests pass, two-stage review (spec + quality) complete | | QA | Health score >= 70, zero critical bugs | | Polish | Before/after evidence, no AI slop detected | | Ship | PR created, CI passes, version bumped | | Post-Ship | Canary health check passes or skipped | | Learn | Preferences and self-improvement data updated |

Iron Laws

Four non-negotiable rules enforced throughout the pipeline:

No code before design approval — brainstorm phase must complete first
No production code without a failing test — TDD in the build phase
No fixes without root cause investigation — systematic debugging
No completion claims without verification evidence — run the command, read the output, then claim

Scoring

Every idea must pass a hard filter: real monetization potential OR high open-source impact.

6 Dimensions (max 60)

| Dimension | What it measures | Bonus | |-----------|-----------------|-------| | Revenue potential | Willingness to pay, market size, pricing precedent | | | Market gap | Number and quality of existing competitors | | | Demand signal | Evidence from HN, Reddit, GitHub | +2/+3 cross-source | | Feasibility | Can a solo dev build an MVP with AI? | | | Stack fit | Match to your known tech stack | Bonus, not blocker | | Trend momentum | Is this space growing? | +2 agentic AI |

Score Interpretation

| Score | Rating | Action | |-------|--------|--------| | 45-60 | Exceptional | Build this now | | 35-44 | Strong | Worth serious consideration | | 25-34 | Decent | Consider if it matches interests | | < 25 | Filtered | Not shown |

Data Sources

All research uses token-efficient structured APIs (not generic web scraping):

| Category | What We Search | API | |----------|---------------|-----| | Agentic AI (priority) | Agent frameworks, MCP servers, Claude Code plugins | HN Algolia, GitHub Search | | Developer Pain Points | Reddit complaints, HN Ask threads | Reddit JSON, HN Algolia | | Trending Projects | GitHub Trending, Show HN | GitHub Search API | | Indie Hacker Signals | r/SideProject, r/indiehackers revenue stories | Reddit JSON | | Package Ecosystem | npm trending, plugin marketplace gaps | npm Registry API | | Your GitHub Profile | Your repos, languages, project types | GitHub REST API |

Self-Improvement

IdeaBox gets smarter the more you use it through 4 auto-iterating loops:

1. Preference Learning (immediate)

Tracks which ideas you accept, dismiss, start, complete, or abandon. Updates category/complexity/monetization scores. Anti-echo-chamber: always reserves 10% for diverse suggestions.

2. Source Quality Tracking (after 5 sessions)

Tracks which research sources contribute to ideas you actually choose. High-performing sources get expanded queries; low-performing sources get reduced. Scores clamped to [0.1, 1.0] — nothing fully eliminated.

3. Scoring Weight Adaptation (after 10 outcomes)

Tracks which scoring dimensions predict ideas that actually ship. Auto-adjusts weights (e.g., demand signal matters more than feasibility for you). Clamped to [0.5x, 1.5x] to prevent over-fitting.

4. Query Evolution (after 3 uses per query)

Tracks which search queries return useful results. Productive queries spawn variations; dead queries retire. Retired queries archived for trend analysis.

Reset: /ideabox profile -> choose "reset learning"

Persistence

All data stored locally — nothing leaves your machine:

| File | Location | Purpose | |------|----------|---------| | profile.json | ~/.ideabox/ | Interests, stacks, preference scores | | ideas.jsonl | ~/.ideabox/ | All ideas with scores and status (append-only) | | sessions.jsonl | ~/.ideabox/ | Research session logs | | preferences.jsonl | ~/.ideabox/ | Implicit feedback events | | source-quality.jsonl | ~/.ideabox/ | Source quality tracking | | scoring-feedback.jsonl | ~/.ideabox/ | Scoring weight adaptation | | query-performance.jsonl | ~/.ideabox/ | Query evolution data | | research/ | ~/.ideabox/ | Auto-saved research history | | state.json | .ideabox/ (per-project) | Pipeline state for resume |

Architecture

Self-contained Claude Code plugin. Zero external dependencies.

ideabox/
├── .claude-plugin/
│   └── plugin.json              # Plugin manifest (name: "ideabox")
├── bin/
│   └── cli.mjs                  # CLI: init, uninstall, doctor
├── skills/
│   ├── ideabox/                 # Main skill (/ideabox)
│   │   ├── SKILL.md             # Pipeline router (~3KB)
│   │   ├── phases/
│   │   │   ├── 01-research.md
│   │   │   ├── 02-brainstorm.md
│   │   │   ├── 03-plan.md
│   │   │   ├── 04-build.md
│   │   │   ├── 05-qa.md
│   │   │   ├── 06-polish.md
│   │   │   ├── 07-ship.md
│   │   │   ├── 08-post-ship.md
│   │   │   └── 09-learn.md
│   │   └── references/
│   │       ├── research-sources.md
│   │       ├── scoring-rubric.md
│   │       ├── revenue-models.md
│   │       └── self-improvement.md
│   ├── research/SKILL.md        # Browse-only mode
│   ├── backlog/SKILL.md         # Idea management + compare
│   └── profile/SKILL.md         # Profile setup + reset learning
├── CLAUDE.md                    # Agent rules (Claude Code)
├── AGENTS.md                    # Agent rules (Codex CLI)
├── package.json
├── LICENSE
└── README.md

CLI

npx ideabox init        # Install skills to current project
npx ideabox uninstall   # Remove skills from current project
npx ideabox doctor      # Run health checks

The doctor command verifies:

Node.js >= 18
Data directory (~/.ideabox/)
Profile configured
Skills installed (checks Claude Code and Codex paths)
CLI availability

Contributing

Contributions welcome! Please:

Fork the repository
Create a feature branch (git checkout -b feat/my-feature)
Make your changes
Run node bin/cli.mjs doctor to verify
Submit a pull request

Acknowledgements

IdeaBox's pipeline design was inspired by patterns and approaches from these projects:

| Project | Inspiration | |---------|-------------| | superpowers | Brainstorming workflow, TDD discipline, writing-plans, subagent-driven development, verification-before-completion, systematic debugging | | gstack | QA health score rubric, design review with AI slop detection, ship workflow, canary monitoring, office-hours forcing questions | | oh-my-claudecode | Deep-interview validation, artifact-based phase skipping, autopilot phase chaining | | last30days-skill | Research stats block, auto-save research history, comparison mode | | Homunculus | Implicit feedback tracking, instinct-based self-improvement | | Spec-Flow | Phase-gate pipeline architecture, progressive disclosure pattern | | planning-with-files | "Context = RAM, Filesystem = Disk" principle |

Thank you to these projects and their maintainers for sharing their work openly.

License

MIT - Pawan Paudel