pr-splitter
v0.0.3
CLI that splits one large pull request into many small pull requests
A CLI tool that uses AI to split large pull requests into smaller, semantically meaningful ones using vector search and hunk-level staging.
How it works
- Build a vector database of the diff, one entry per hunk
- Provide context and tools to the PR summary planner
- The planner outputs a plan
- The plan is turned into a checklist whose items carry a status: idle, in_progress, tests_passed, tests_failed, or finished
- The PR splitter is given access to all the tools: search_hunks, grep_hunks, update_plan, read_plan, run_tests, apply_hunks, and other git operations
- Validate the result: diff equality, all tests passed
- Ask to push branches
- Ask to create pull requests
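The first step above, one database entry per hunk, can be sketched as follows. This is a minimal illustration, not the tool's actual parser: the `Hunk` shape and the `parseHunks` name are assumptions, and the sketch only handles plain `git diff` output (no renames, deletions, or binary files).

```typescript
// Hypothetical shape of one entry; the field names are assumptions.
interface Hunk {
  file: string;    // path taken from the "+++ b/<path>" line
  header: string;  // the "@@ -a,b +c,d @@" line
  lines: string[]; // context, added, and removed lines of the hunk
}

// Split `git diff` output into hunks, one entry per "@@" block.
function parseHunks(diff: string): Hunk[] {
  const hunks: Hunk[] = [];
  let file = "";
  let current: Hunk | null = null;

  for (const line of diff.split("\n")) {
    if (line.startsWith("diff --git ")) {
      // A new file section closes whatever hunk was open.
      current = null;
    } else if (line.startsWith("@@")) {
      current = { file, header: line, lines: [] };
      hunks.push(current);
    } else if (current !== null) {
      // Inside a hunk: keep context (" "), added ("+"), removed ("-"),
      // and "\ No newline at end of file" marker lines.
      if (/^[ +\-\\]/.test(line)) {
        current.lines.push(line);
      }
    } else if (line.startsWith("+++ ")) {
      // File header, only reached outside a hunk.
      file = line.slice(4).replace(/^b\//, "");
    }
  }
  return hunks;
}
```

Checking for an open hunk before looking at `+++ ` matters: an added line whose content begins with `++ ` would otherwise be mistaken for a file header.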
Installation
# Install dependencies
pnpm install
# Build the project
pnpm run build
Usage
Development Mode
pnpm run dev
Production Mode
# Build first
pnpm run build
# Run the compiled version
pnpm start
Using the CLI
- Run the command
- Enter a GitHub pull request URL when prompted (e.g., https://github.com/owner/repo/pull/123)
- The tool will:
  - Clone the repository to a temporary directory
  - Fetch the pull request branch
  - Check out the PR head branch
  - Display detailed commit information
  - Clean up temporary files
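Before anything is cloned, the URL entered at the prompt has to be turned into an owner, repository, and PR number. A minimal sketch of that parsing step is shown here; `PullRequestRef`, `parsePullRequestUrl`, and `formatRef` are hypothetical names, and the accepted URL shape is an assumption based on the example URL above.

```typescript
// Hypothetical result type; not taken from the tool's source.
interface PullRequestRef {
  owner: string;
  repo: string;
  number: number;
}

// Parse URLs of the form https://github.com/<owner>/<repo>/pull/<number>,
// optionally followed by a sub-path, query string, or fragment.
// Returns null when the input does not look like a pull request URL.
function parsePullRequestUrl(url: string): PullRequestRef | null {
  const pattern =
    /^https:\/\/github\.com\/([^/\s]+)\/([^/\s]+)\/pull\/(\d+)(?:[/?#].*)?$/;
  const match = pattern.exec(url.trim());
  if (match === null) {
    return null;
  }
  return { owner: match[1], repo: match[2], number: Number(match[3]) };
}

// Render the short "owner/repo#number" form.
function formatRef(ref: PullRequestRef): string {
  return `${ref.owner}/${ref.repo}#${ref.number}`;
}
```

Returning `null` instead of throwing lets the prompt loop ask again when the input is not a pull request URL.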
Example
$ pnpm run dev
PR Splitter CLI
Split large pull requests into smaller, manageable ones
? Enter the pull request URL: https://github.com/facebook/react/pull/12345
Parsed PR: facebook/react#12345
Cloning repository facebook/react...
Repository cloned successfully
Fetching pull request...
Pull request data fetched
Checking out PR #12345...
Checked out PR #12345
Getting commit details...
Commit Details:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Commit Hash: abc123def456...
Author: John Doe <[email protected]>
Date: Wed Jan 10 2024 14:30:00 GMT-0800 (PST)
Message: Add new feature for splitting PRs
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
File Changes:
src/index.ts | 150 +++++++++++++++++++++++++++
package.json | 12 +++
README.md | 45 ++++++++
3 files changed, 207 insertions(+)
Cleaned up temporary directory
Process completed successfully!
Requirements
- Node.js 18+
- pnpm package manager
- Git installed and available in PATH
Todo
- [x] Ask for confirmation to create the pull requests
- [x] Ask for confirmation to close the original pull request
- [x] Show progress of grouping
- [x] Progress of grouping should be correct width
- [x] Remove press any key to cancel
- [x] Fix bug where it keeps "Grouped 3 of 4 hunks" around after it's done
- [x] Cloning should happen in a real tmp directory
- [x] Remove all the AI cruft
- [x] Reset the state whenever we do a split pull request, not when we finish one
- [x] Make the refining groups screen better
- [x] Unify the progress bar so that it shows for the entire split process
- [x] Give a satisfying little thing after the split is complete
- [ ] Give AI the grep hunks tool
- [x] PR confirmation should allow user to see the diffs in each group
- [ ] Rewrite this README so it is not AI generated, modeled on a couple of good examples
- [ ] Publish the package to npm
Architecture
Vector Database Approach
The tool creates a semantic understanding of code changes by:
- Parsing diffs into hunks: Each code change is broken down into atomic units
- Generating embeddings: Vector representations capture semantic meaning of each hunk
- Semantic search: LLM can search for related hunks across the entire diff
- Intelligent grouping: Changes are grouped by logical relationships, not just file proximity
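The embed-then-search flow described above can be illustrated with a toy index. This is only a sketch: `embed` here is a stand-in that hashes identifier-like tokens into a fixed-size count vector, whereas the real tool presumably calls an embedding model, and all names (`IndexedHunk`, `indexHunks`, `searchHunks`) are assumptions.

```typescript
const DIMENSIONS = 64;

// Stand-in embedding (assumption): hash each identifier-like token into a
// fixed-size count vector. A real setup would call an embedding model here.
function embed(text: string): number[] {
  const vector = new Array<number>(DIMENSIONS).fill(0);
  const tokens = text.toLowerCase().match(/[a-z_][a-z0-9_]*/g) ?? [];
  for (const token of tokens) {
    let hash = 0;
    for (let i = 0; i < token.length; i++) {
      hash = (hash * 31 + token.charCodeAt(i)) >>> 0;
    }
    vector[hash % DIMENSIONS] += 1;
  }
  return vector;
}

// Cosine similarity of two equal-length vectors; 0 when either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

interface IndexedHunk {
  id: string;
  text: string;
  embedding: number[];
}

// Build the index: one embedding per hunk.
function indexHunks(hunks: { id: string; text: string }[]): IndexedHunk[] {
  return hunks.map((hunk) => ({ ...hunk, embedding: embed(hunk.text) }));
}

// Return the topK hunks most similar to the query, best match first.
function searchHunks(index: IndexedHunk[], query: string, topK: number): IndexedHunk[] {
  const queryEmbedding = embed(query);
  return index
    .map((hunk) => ({ hunk, score: cosineSimilarity(queryEmbedding, hunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((entry) => entry.hunk);
}
```

A linear scan like this is enough for the few hundred hunks of a single pull request; a dedicated vector store only pays off at a much larger scale.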
Tools Available to AI
- search_hunks: Find semantically related code changes using vector search
- grep_hunks: Search hunks by text patterns or keywords
- update_plan: Modify the splitting plan based on discovered relationships
- read_plan: Access current plan and progress status
- run_tests: Validate each split passes tests before proceeding
- apply_hunks: Stage specific hunks for atomic commits
- Standard git operations: branch creation, commits, resets, etc.
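To make the tool list concrete, here is a rough sketch of how a text-search tool such as grep_hunks might be implemented and dispatched by name. The argument format, the `StoredHunk` shape, and the `createTools` and `callTool` helpers are all assumptions for illustration; the README does not specify how tools are wired to the model.

```typescript
// Hypothetical stored hunk; field names are assumptions.
interface StoredHunk {
  id: string;
  file: string;
  text: string;
}

// Return the hunks that have at least one line matching the pattern.
function grepHunks(hunks: StoredHunk[], pattern: string, ignoreCase = false): StoredHunk[] {
  // No "g" flag, so RegExp.test keeps no state between calls.
  const regex = new RegExp(pattern, ignoreCase ? "i" : "");
  return hunks.filter((hunk) => hunk.text.split("\n").some((line) => regex.test(line)));
}

// A tool takes string arguments and returns a string the model can read.
type ToolHandler = (args: Record<string, string | undefined>) => string;

// Register tools under the names the model uses to call them.
function createTools(hunks: StoredHunk[]): Map<string, ToolHandler> {
  const tools = new Map<string, ToolHandler>();
  tools.set("grep_hunks", (args) => {
    const pattern = args["pattern"];
    if (pattern === undefined) {
      return "Missing argument: pattern";
    }
    return JSON.stringify(grepHunks(hunks, pattern).map((hunk) => hunk.id));
  });
  return tools;
}

// Dispatch a tool call by name, reporting unknown tools instead of throwing.
function callTool(
  tools: Map<string, ToolHandler>,
  name: string,
  args: Record<string, string | undefined>,
): string {
  const handler = tools.get(name);
  return handler === undefined ? `Unknown tool: ${name}` : handler(args);
}
```

Returning error strings rather than throwing keeps a bad tool call visible to the model, which can then retry with corrected arguments.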
Plan Management
The splitting process uses a dynamic checklist system:
- idle: Task not started
- in_progress: Currently being worked on
- tests_passed: Implementation complete and validated
- tests_failed: Needs revision
- finished: Ready for PR creation
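The statuses above map naturally onto a union type. The following sketch shows one way update_plan and read_plan could work over such a checklist; the transition rules in `ALLOWED` are an assumption (the README lists the statuses but not which moves are legal), as are the `PlanItem` shape and function names.

```typescript
type TaskStatus = "idle" | "in_progress" | "tests_passed" | "tests_failed" | "finished";

interface PlanItem {
  title: string;
  status: TaskStatus;
}

// Assumed transition rules: a task is started, tested, reworked on failure,
// and finished only after its tests have passed.
const ALLOWED: Record<TaskStatus, TaskStatus[]> = {
  idle: ["in_progress"],
  in_progress: ["tests_passed", "tests_failed"],
  tests_failed: ["in_progress"],
  tests_passed: ["finished"],
  finished: [],
};

// Return a new plan with the named item moved to `next`.
// Throws when the move is not allowed by the rules above.
function updatePlan(plan: PlanItem[], title: string, next: TaskStatus): PlanItem[] {
  return plan.map((item) => {
    if (item.title !== title) {
      return item;
    }
    if (!ALLOWED[item.status].includes(next)) {
      throw new Error(`Cannot move "${title}" from ${item.status} to ${next}`);
    }
    return { ...item, status: next };
  });
}

// Render the plan as a plain-text checklist.
function readPlan(plan: PlanItem[]): string {
  return plan.map((item) => `[${item.status}] ${item.title}`).join("\n");
}
```

Rejecting illegal moves, such as jumping from idle straight to finished, is what keeps the checklist trustworthy as a record of which splits have actually been tested.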
Validation & Safety
- Diff equality: Ensures all changes are preserved across splits
- Test validation: Each split must pass the project's test suite
- Incremental approach: Builds commits progressively to maintain working state
- User confirmation: Asks permission before pushing branches or creating PRs
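One way to think about the diff equality check is as a set comparison: every hunk of the original diff should land in exactly one split. The sketch below works on hunk IDs and is an assumption about the approach; the actual tool may instead compare the combined diff of the split branches against the original. It also assumes the original hunk IDs are unique.

```typescript
interface EqualityReport {
  missing: string[];    // original hunks that no group contains
  duplicated: string[]; // original hunks that appear more than once
  unknown: string[];    // hunk IDs in groups that are not in the original diff
  equal: boolean;
}

// Check that the groups together contain every original hunk exactly once.
function checkDiffEquality(originalHunkIds: string[], groups: string[][]): EqualityReport {
  const counts = new Map<string, number>();
  for (const id of originalHunkIds) {
    counts.set(id, 0);
  }

  const unknown: string[] = [];
  for (const group of groups) {
    for (const id of group) {
      const seen = counts.get(id);
      if (seen === undefined) {
        unknown.push(id);
      } else {
        counts.set(id, seen + 1);
      }
    }
  }

  const missing = originalHunkIds.filter((id) => counts.get(id) === 0);
  const duplicated = originalHunkIds.filter((id) => (counts.get(id) ?? 0) > 1);
  return {
    missing,
    duplicated,
    unknown,
    equal: missing.length === 0 && duplicated.length === 0 && unknown.length === 0,
  };
}
```

Reporting the three failure kinds separately tells the splitter what to fix: assign a missing hunk, drop a duplicate, or discard a change that was never part of the original pull request.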
Current Features
- AI-powered diff analysis with retry logic and rate limiting
- File categorization (core, dependencies, tests, docs, config, generated, styles)
- Repository configuration management per PR URL
- Hunk-level staging design (see /docs/hunk-staging-design.md)
- Model tier switching (premium/standard) for cost optimization
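As an illustration of the file categorization feature, the sketch below assigns each changed path to one of the seven categories listed above. The category names come from this README, but the path patterns in `RULES` are illustrative guesses, not the tool's actual rules.

```typescript
type FileCategory =
  | "core"
  | "dependencies"
  | "tests"
  | "docs"
  | "config"
  | "generated"
  | "styles";

// Assumed rules: the first matching pattern wins, and "core" is the fallback.
const RULES: Array<[FileCategory, RegExp]> = [
  ["dependencies", /(^|\/)(package\.json|pnpm-lock\.yaml|package-lock\.json|yarn\.lock)$/],
  ["generated", /(^|\/)(dist|build)\/|\.min\.js$/],
  ["tests", /(^|\/)(__tests__|tests?)\/|\.(test|spec)\.[jt]sx?$/],
  ["docs", /\.(md|mdx)$|(^|\/)docs\//],
  ["styles", /\.(css|scss|less)$/],
  ["config", /(^|\/)\.[^/]+rc(\.[a-z]+)?$|\.config\.[cm]?[jt]s$|(^|\/)tsconfig[^/]*\.json$|\.ya?ml$/],
];

// Categorize a repository-relative path using forward slashes.
function categorizeFile(path: string): FileCategory {
  for (const [category, pattern] of RULES) {
    if (pattern.test(path)) {
      return category;
    }
  }
  return "core";
}
```

With these rules, the three files from the example run above (`src/index.ts`, `package.json`, `README.md`) fall into core, dependencies, and docs respectively. Rule order matters: `pnpm-lock.yaml` is claimed by the dependencies rule before the generic YAML config pattern can see it.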
