comment-catcher
v0.1.10
CLI tool to detect outdated code comments in PRs
Comment Catcher
A CLI tool that detects outdated code comments in TypeScript/JavaScript projects using AI analysis.
Features
- Analyzes git diffs to find code changes
- Uses dependency graphs to find affected files beyond direct changes
- Extracts code comments using AST parsing
- Uses Claude AI to intelligently identify outdated comments
Installation
As a Development Tool (Recommended)
Install in your project:
npm install --save-dev comment-catcher
# or
yarn add -D comment-catcher
# or
pnpm add -D comment-catcher
Then add to your package.json scripts:
{
  "scripts": {
    "check-comments": "comment-catcher check"
  }
}
Global Installation
npm install -g comment-catcher
From Source
git clone https://github.com/zachicecreamcohn/comment-catcher.git
cd comment-catcher
npm install
npm run build
Usage
Prerequisites
Set your Anthropic API key:
export ANTHROPIC_API_KEY=your_api_key_here
Optional: If you're using a custom API endpoint (e.g., a proxy or alternative provider):
export ANTHROPIC_BASE_URL=https://your-custom-endpoint.com
Basic Usage
# If installed as dev dependency
npm run check-comments
# If installed globally
comment-catcher check
# From source
npm start check
Options
npm start check [options]
Options:
  -b, --base <branch>    Base branch to compare against (default: "main")
  -d, --depth <number>   Dependency graph depth to traverse (default: "3")
  -o, --output <file>    Output file for the report
  -f, --format <format>  Output format: markdown or json (default: "markdown")
  --no-deps              Skip dependency analysis (only check changed files)
Performance Tips for Large Codebases
Comment Catcher handles large codebases by scanning outward from the changed files only, not the entire codebase. If you still run into memory or performance issues, try the following:
- Reduce depth: Lower the dependency traversal depth (default is 3)
npm start check -d 1 # Only check immediate dependents
- Skip dependency analysis: Use --no-deps to analyze only the changed files (not recommended, as you'll miss outdated comments in dependents)
npm start check --no-deps
- Increase Node.js memory: For very large codebases
NODE_OPTIONS="--max-old-space-size=8192" npm start check
Examples
# Check against main branch
npm start check
# Check against develop branch with deeper dependency analysis
npm start check -b develop -d 5
# Save report to file
npm start check -o report.md
# Generate JSON report
npm start check -f json -o report.json
# Skip dependency analysis for large codebases
npm start check --no-deps
# Reduce depth to avoid memory issues
npm start check -d 1
Configuration (Optional)
You don't need a config file to get started! Comment Catcher works out of the box with sensible defaults.
Defaults
If no config file is found, Comment Catcher uses these defaults:
- Excludes: node_modules, dist, build, .git
- Extensions: .js, .jsx, .ts, .tsx, .mjs, .cjs
- Min comment length: 10 characters
- Ignored patterns: TODO:, FIXME:, NOTE:, HACK:, @ts-*, eslint-*, prettier-*
- Model: claude-3-5-sonnet-20241022
- API endpoint: https://api.anthropic.com (override with the ANTHROPIC_BASE_URL env var)
- API key: from the ANTHROPIC_API_KEY environment variable
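To make the file and comment defaults concrete, here is a minimal sketch of how such filters could be applied. The names `DEFAULTS`, `isAnalyzableFile`, and `shouldAnalyzeComment` are hypothetical, and the matching rules (path-segment match for excludes, substring match for ignored patterns with a trailing `*` treated as a wildcard) are assumptions for illustration, not Comment Catcher's actual implementation.

```typescript
// Hypothetical sketch: names and matching semantics are assumptions,
// not Comment Catcher's real internals.
const DEFAULTS = {
  excludes: ["node_modules", "dist", "build", ".git"],
  extensions: [".js", ".jsx", ".ts", ".tsx", ".mjs", ".cjs"],
  minCommentLength: 10,
  ignorePatterns: ["TODO:", "FIXME:", "NOTE:", "HACK:", "@ts-*", "eslint-*", "prettier-*"],
};

// A file qualifies if it has a listed extension and no excluded path segment.
function isAnalyzableFile(path: string): boolean {
  const segments = path.split("/");
  if (segments.some((s) => DEFAULTS.excludes.includes(s))) return false;
  return DEFAULTS.extensions.some((ext) => path.endsWith(ext));
}

// Assumed semantics: a trailing "*" is a wildcard, so "@ts-*" matches any
// comment containing "@ts-"; other patterns match as plain substrings.
function shouldAnalyzeComment(text: string): boolean {
  const trimmed = text.trim();
  if (trimmed.length < DEFAULTS.minCommentLength) return false;
  return !DEFAULTS.ignorePatterns.some((p) =>
    trimmed.includes(p.endsWith("*") ? p.slice(0, -1) : p)
  );
}
```

Under these assumptions, `src/utils/date.ts` would be analyzed while anything under `node_modules/` would not, and a comment such as `TODO: refactor this later` would be skipped.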
Custom Configuration
To customize behavior, create a config file in one of these locations:
- comment-catcher.config.json
- .comment-catcher.json
- comment-catcher.config.js
Available Options
{
  "excludePatterns": ["node_modules", "dist"], // Additional patterns to exclude
  "extensions": [".js", ".ts", ".tsx"], // File extensions to analyze
  "dependencyOptions": {
    "tsConfig": "./tsconfig.json", // TypeScript config for path resolution
    "webpackConfig": "./webpack.config.js" // Webpack config for alias resolution
  },
  "commentFilters": {
    "minLength": 10, // Minimum comment length to analyze
    "ignorePatterns": ["TODO:", "FIXME:"] // Comment patterns to skip
  },
  "llmOptions": {
    "model": "claude-3-5-sonnet-20241022", // Claude model to use
    "baseURL": "https://api.anthropic.com" // Custom API endpoint (optional); can also be set via ANTHROPIC_BASE_URL
  }
}
The inline comments are annotations only; strict JSON does not allow comments, so omit them in a real .json config file.
See comment-catcher.config.example.json for a complete example.
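The documented options can be read as a partial object layered over the defaults. The sketch below shows one way that layering might look. The interface name `CommentCatcherConfig`, the `defaultConfig` object, and `mergeConfig` are hypothetical, and the merge rules are assumptions: `excludePatterns` is treated as extending the defaults (the option is described as "additional patterns"), while every other option replaces its default. The README does not state the tool's actual merge behavior.

```typescript
// Hypothetical types mirroring the documented option names.
interface CommentCatcherConfig {
  excludePatterns?: string[];
  extensions?: string[];
  dependencyOptions?: { tsConfig?: string; webpackConfig?: string };
  commentFilters?: { minLength?: number; ignorePatterns?: string[] };
  llmOptions?: { model?: string; baseURL?: string };
}

const defaultConfig = {
  excludePatterns: ["node_modules", "dist", "build", ".git"],
  extensions: [".js", ".jsx", ".ts", ".tsx", ".mjs", ".cjs"],
  dependencyOptions: {} as { tsConfig?: string; webpackConfig?: string },
  commentFilters: {
    minLength: 10,
    ignorePatterns: ["TODO:", "FIXME:", "NOTE:", "HACK:", "@ts-*", "eslint-*", "prettier-*"],
  },
  llmOptions: {
    model: "claude-3-5-sonnet-20241022",
    baseURL: "https://api.anthropic.com",
  },
};

// Assumption: excludePatterns extend the defaults, while every other
// option replaces its default when the user provides it.
function mergeConfig(user: CommentCatcherConfig) {
  return {
    excludePatterns: Array.from(
      new Set([...defaultConfig.excludePatterns, ...(user.excludePatterns ?? [])])
    ),
    extensions: user.extensions ?? defaultConfig.extensions,
    dependencyOptions: { ...defaultConfig.dependencyOptions, ...user.dependencyOptions },
    commentFilters: { ...defaultConfig.commentFilters, ...user.commentFilters },
    llmOptions: { ...defaultConfig.llmOptions, ...user.llmOptions },
  };
}

// Example: add one exclude and raise the minimum comment length.
const merged = mergeConfig({
  excludePatterns: ["generated", "dist"],
  commentFilters: { minLength: 20 },
});
```

In this example `merged.excludePatterns` keeps the four defaults and gains `generated` (the duplicate `dist` is dropped), and `merged.commentFilters` keeps the default ignore patterns while `minLength` becomes 20.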
How It Works
- Git Diff: Identifies files changed compared to the base branch
- Dependency Graph: Uses dependency-cruiser to find related files
  - Performance optimized: Scans only the changed files as entry points, not the entire codebase
  - Traverses in both directions up to the specified depth:
    - Upward: Files that import the changed files (dependents)
    - Downward: Files that the changed files import (dependencies)
- Comment Extraction: Parses TypeScript/JavaScript files to extract all comments
- AI Analysis: Claude analyzes comments against the diff to identify outdated ones
- Report: Generates a report with reasons and suggestions for updating
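The dependency-graph step can be pictured as a depth-limited breadth-first walk from the changed files. The sketch below is a toy illustration only: the real tool delegates graph construction to dependency-cruiser, whereas this example uses a made-up in-memory import map. The names `ImportGraph`, `walk`, and `relatedFiles` are hypothetical, and walking the upward and downward directions separately is an assumption about how "both directions" is interpreted.

```typescript
// Toy import graph: file -> files it imports (made-up data).
type ImportGraph = Record<string, string[]>;

// Breadth-first walk along one direction of edges, at most `depth` hops.
function walk(edges: Map<string, string[]>, start: string[], depth: number): Set<string> {
  const seen = new Set<string>(start);
  let frontier = [...start];
  for (let level = 0; level < depth && frontier.length > 0; level++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const neighbor of edges.get(file) ?? []) {
        if (!seen.has(neighbor)) {
          seen.add(neighbor);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return seen;
}

// Union of dependents (upward) and dependencies (downward) of the changed files.
function relatedFiles(graph: ImportGraph, changed: string[], depth: number): Set<string> {
  const down = new Map<string, string[]>(Object.entries(graph));
  const up = new Map<string, string[]>();
  for (const [file, imports] of Object.entries(graph)) {
    for (const dep of imports) {
      up.set(dep, [...(up.get(dep) ?? []), file]); // dep is imported by file
    }
  }
  return new Set([...walk(up, changed, depth), ...walk(down, changed, depth)]);
}

// Example: app.ts and service.test.ts import service.ts,
// which imports util.ts, which imports constants.ts.
const graph: ImportGraph = {
  "app.ts": ["service.ts"],
  "service.test.ts": ["service.ts"],
  "service.ts": ["util.ts"],
  "util.ts": ["constants.ts"],
  "constants.ts": [],
};
const related = relatedFiles(graph, ["service.ts"], 1);
```

With `service.ts` as the only changed file and a depth of 1, `related` contains `service.ts`, its two importers, and `util.ts`; `constants.ts` is only reached at depth 2.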
Development
# Build
npm run build
# Watch mode
npm run dev
This makes it suitable for use in CI/CD pipelines.
GitHub Actions Integration
Comment Catcher can automatically check PRs and post feedback as comments.
Setup
- Add the workflow file to your repository at .github/workflows/comment-catcher.yml:
name: Comment Catcher
on:
  pull_request:
    types: [opened, synchronize, reopened]
jobs:
  check-comments:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
      contents: read
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Fetch full history for git diff
      - name: Run Comment Catcher
        uses: zachicecreamcohn/comment-catcher@v1
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
          base-branch: main # Optional, defaults to 'main'
          depth: 3 # Optional, defaults to 3
- Add your Anthropic API key as a repository secret:
  - Go to your repository Settings → Secrets and variables → Actions
  - Click "New repository secret"
  - Name: ANTHROPIC_API_KEY
  - Value: Your Anthropic API key
That's it! The action will now:
- Run on every PR (opened, updated, or reopened)
- Analyze comments for outdated documentation
- Post or update a comment on the PR with results
- Always provide feedback (even if no issues found)
Action Inputs
| Input | Description | Required | Default |
|-------|-------------|----------|---------|
| github-token | GitHub token for posting PR comments | Yes | - |
| anthropic-api-key | Anthropic API key for Claude AI | Yes | - |
| base-branch | Base branch to compare against | No | main |
| depth | Dependency graph depth to traverse | No | 3 |
| anthropic-base-url | Anthropic API base URL (for custom endpoints) | No | https://api.anthropic.com |
Custom Anthropic Endpoint
If you're using a custom Anthropic endpoint (e.g., a proxy):
- name: Run Comment Catcher
  uses: zachicecreamcohn/comment-catcher@v1
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    anthropic-base-url: https://your-anthropic-proxy.com
Behavior
- ✅ Always comments - Posts feedback even when no issues found
- 🔄 Updates existing comment - Keeps PR clean by updating the same comment
- ℹ️ Informational only - Never fails the check, only provides suggestions
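The "updates existing comment" behavior is commonly achieved by embedding a hidden marker in the bot's comment and looking for it on later runs. The README does not say how the action recognizes its earlier comment, so the sketch below is an assumption throughout: the `MARKER` string, the `PrComment` shape, and `planCommentAction` are all hypothetical names chosen for illustration.

```typescript
// Hypothetical marker embedded in the bot's comment body; the README does
// not document how the action recognizes its earlier comment.
const MARKER = "<!-- comment-catcher-report -->";

interface PrComment {
  id: number;
  body: string;
}

type CommentAction =
  | { kind: "create"; body: string }
  | { kind: "update"; id: number; body: string };

// Decide whether to create a new PR comment or update the previous one.
function planCommentAction(existing: PrComment[], report: string): CommentAction {
  const body = `${MARKER}\n${report}`;
  const previous = existing.find((c) => c.body.startsWith(MARKER));
  return previous
    ? { kind: "update", id: previous.id, body }
    : { kind: "create", body };
}
```

Keeping the decision in a pure function like this makes it easy to test without network access; the resulting action would then be carried out through GitHub's issue-comment endpoints, which also serve pull request conversation comments.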
Advanced Configuration
To customize the analysis behavior (file patterns, comment filters, Claude model, etc.), create a comment-catcher.config.json file in your repository root. The action will automatically detect and use it.
See the Configuration section above for all available options.
Example: To exclude additional directories or change the Claude model, add this to your repo:
{
  "excludePatterns": ["generated", "vendor"],
  "llmOptions": {
    "model": "claude-3-5-sonnet-20241022"
  }
}
Memory Optimization
Comment Catcher is designed to work efficiently with large codebases by using a targeted scanning approach:
- Only scans changed files as entry points - Instead of scanning your entire codebase (which could be 10,000+ files), dependency-cruiser only scans the files you actually changed
- Traverses in both directions from there - From those changed files, it finds related files up to the depth you specify (default: 3 levels):
  - Files that import the changed files (dependents)
  - Files that the changed files import (dependencies)
- Low memory cache - Uses a 100ms cache duration to minimize memory usage
This means if you changed 5 files in a 10,000 file codebase, it will only scan those 5 files plus their related files, not all 10,000 files.
