codemodctl
v0.1.28
CLI tool and utilities for workflow engine operations, file sharding, and codeowner analysis
Installation
npm install codemodctl
Usage
As a CLI Tool
# Analyze CODEOWNERS and generate sharding configuration
codemodctl codeowner --shard-size 20 --state-prop shards --rule ./rule.yaml
As a Library
Deterministic File Sharding
import { getShardForFilename, fitsInShard, distributeFilesAcrossShards, analyzeShardScaling } from 'codemodctl/sharding';
// Get the shard index for a specific file - always deterministic!
const shardIndex = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
// Same file + same shard count = same result, every time
const shard1 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
const shard2 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
console.log(shard1 === shard2); // always true
// Check if a file belongs to a specific shard
const belongsToShard = fitsInShard('src/components/Button.tsx', {
  shardCount: 5,
  shardIndex: 2
});
// Distribute all files across shards with consistent hashing
const files = ['file1.ts', 'file2.ts', 'file3.ts'];
const distribution = distributeFilesAcrossShards(files, 5);
// Check scaling behavior - minimal reassignment when growing
const scalingAnalysis = analyzeShardScaling(files, 5, 6);
console.log(`${scalingAnalysis.stableFiles} files stay in same shard`);
console.log(`${scalingAnalysis.reassignmentPercentage}% reassignment`); // Much less than 100%
Codeowner Analysis
import { analyzeCodeowners, findCodeownersFile } from 'codemodctl/codeowners';
// Analyze codeowners and generate shard configuration
const result = await analyzeCodeowners({
  shardSize: 20,
  rulePath: './rule.yaml',
  projectRoot: process.cwd()
});
console.log(`Generated ${result.shards.length} shards for ${result.totalFiles} files`);
result.teams.forEach(team => {
  console.log(`Team "${team.team}" owns ${team.fileCount} files`);
});
Complete API
import codemodctl from 'codemodctl';
// Access all utilities through the default export
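// getShardForFilename is used synchronously in the sharding examples, so no await is needed
const shardIndex = codemodctl.sharding.getShardForFilename('file.ts', { shardCount: 5 });
// `options` has the same shape as in the codeowner analysis example above
const analysis = await codemodctl.codeowners.analyzeCodeowners(options);
Key Features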
Consistent File Sharding
The sharding algorithm uses consistent hashing to ensure:
- Perfect consistency: Same file + same shard count = same result, always
- No external state: The result depends only on the filename and the shard count
- Minimal reassignment: When scaling up, only ~20-40% of files move (not 100%)
- Stable scaling: Adding new shards leaves most existing file assignments in place
- Simple API: No complex parameters or configuration needed
- Team-aware sharding: Works with codeowner boundaries
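To make the consistency properties concrete, here is a minimal sketch of ring-based consistent hashing in TypeScript. It is not codemodctl's actual implementation: the helper names (hashPosition, buildRing, shardFor), the FNV-1a style hash, and the 64 points per shard are all assumptions chosen for illustration.

```typescript
// Illustrative sketch only: NOT codemodctl's implementation.
// hashPosition, buildRing and shardFor are hypothetical names.

type RingPoint = { position: number; shard: number };

// FNV-1a style 32-bit hash over the string's UTF-16 code units.
function hashPosition(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // unsigned 32-bit position on the ring
}

// Each shard contributes several points whose positions depend only on
// the shard index, so adding a shard never moves the existing points.
function buildRing(shardCount: number, pointsPerShard = 64): RingPoint[] {
  if (shardCount < 1) throw new Error("shardCount must be at least 1");
  const ring: RingPoint[] = [];
  for (let shard = 0; shard < shardCount; shard++) {
    for (let replica = 0; replica < pointsPerShard; replica++) {
      ring.push({ position: hashPosition(`shard-${shard}#${replica}`), shard });
    }
  }
  // Sort by position; break ties by shard index so the order is deterministic.
  return ring.sort((a, b) => a.position - b.position || a.shard - b.shard);
}

// A file belongs to the first ring point at or after its own position.
function shardFor(filename: string, shardCount: number): number {
  const ring = buildRing(shardCount);
  const position = hashPosition(filename);
  for (const point of ring) {
    if (point.position >= position) return point.shard;
  }
  // Past the last point: wrap around to the first point on the ring.
  return ring[0]!.shard;
}
```

Because each shard's ring points depend only on its own index, going from 5 to 6 shards in this sketch can only move a file into the new shard; every other file keeps its assignment. A fitsInShard-style check then reduces to comparing shardFor(filename, shardCount) with the wanted index. A real implementation would likely cache the ring rather than rebuild it on every call.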
Codeowner Analysis
- Automatic CODEOWNERS detection: Searches common locations (root, .github/, docs/)
- AST-grep integration: Analyze files using custom rules
- Team-based grouping: Groups files by their assigned teams
- Shard generation: Creates optimal shard configuration based on team ownership
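The lookup order can be pictured as a short loop over candidate directories. This sketch assumes the order root, .github/, docs/ and takes the existence check as a parameter so it stays independent of the filesystem; locateCodeowners and CANDIDATE_DIRS are hypothetical names, not the library's findCodeownersFile.

```typescript
// Hypothetical helper; the search order below is an assumption.
const CANDIDATE_DIRS = ["", ".github/", "docs/"];

function locateCodeowners(
  projectRoot: string,
  fileExists: (path: string) => boolean
): string | undefined {
  // Normalize the root so candidates are joined with exactly one slash.
  const root = projectRoot.endsWith("/") ? projectRoot : `${projectRoot}/`;
  for (const dir of CANDIDATE_DIRS) {
    const candidate = `${root}${dir}CODEOWNERS`;
    if (fileExists(candidate)) return candidate; // first match wins
  }
  return undefined; // no CODEOWNERS file in any candidate location
}
```

In a Node.js script, the second argument could be fs.existsSync; passing a stub instead makes the lookup easy to test.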
API Reference
Sharding Functions
- getShardForFilename(filename, { shardCount }) - Get shard index for a file
- fitsInShard(filename, { shardCount, shardIndex }) - Check shard membership
- distributeFilesAcrossShards(files, shardCount) - Distribute files across shards
- calculateOptimalShardCount(totalFiles, targetShardSize) - Calculate optimal shard count
- getFileHashPosition(filename) - Get consistent hash position for a file
- analyzeShardScaling(files, oldCount, newCount) - Analyze reassignment when scaling
All functions are deterministic: same input always produces the same output.
Scaling behavior: When going from N to N+1 shards, typically only 20-40% of files get reassigned to new locations, making it ideal for incremental scaling scenarios.
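The reassignment figure can be checked for any assignment function. The sketch below is a hypothetical helper (measureReassignment is not part of codemodctl) that counts how many files keep their shard when the shard count changes; stableFiles and reassignmentPercentage mirror the fields used in the usage example, while the other field names are assumptions.

```typescript
// Hypothetical report shape; only stableFiles and reassignmentPercentage
// appear in the README's own example.
type ScalingReport = {
  totalFiles: number;
  stableFiles: number;
  reassignedFiles: number;
  reassignmentPercentage: number;
};

function measureReassignment(
  files: string[],
  assign: (filename: string, shardCount: number) => number,
  oldCount: number,
  newCount: number
): ScalingReport {
  let stableFiles = 0;
  for (const file of files) {
    // A file is stable when both shard counts map it to the same index.
    if (assign(file, oldCount) === assign(file, newCount)) stableFiles++;
  }
  const reassignedFiles = files.length - stableFiles;
  const reassignmentPercentage =
    files.length === 0 ? 0 : (reassignedFiles / files.length) * 100;
  return { totalFiles: files.length, stableFiles, reassignedFiles, reassignmentPercentage };
}
```

Passing a plain hash-modulo function as assign gives a useful baseline: it tends to move most files when the count changes, which is the behavior a consistent-hashing scheme is meant to avoid.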
Codeowner Functions
- analyzeCodeowners(options) - Complete analysis with shard generation
- findCodeownersFile(projectRoot?, explicitPath?) - Locate CODEOWNERS file
- loadAstGrepRule(rulePath) - Parse AST-grep rule from YAML
- analyzeFilesByOwner(codeownersPath, rule, projectRoot?) - Group files by owner
- generateShards(filesByOwner, shardSize) - Generate shard configuration
- normalizeOwnerName(owner) - Normalize owner names
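As a rough illustration of what shard generation by owner could look like, the sketch below splits each team's files into chunks of at most shardSize files. The Shard shape, the "index/total" label, and the chunkByOwner name are assumptions for this example; the actual generateShards output may differ.

```typescript
// Hypothetical output shape, not the library's real type.
type Shard = { team: string; shard: string; files: string[] };

function chunkByOwner(filesByOwner: Record<string, string[]>, shardSize: number): Shard[] {
  if (shardSize < 1 || Math.floor(shardSize) !== shardSize) {
    throw new Error("shardSize must be a positive integer");
  }
  const shards: Shard[] = [];
  // Sort teams and files so repeated runs produce identical shards.
  for (const team of Object.keys(filesByOwner).sort()) {
    const files = (filesByOwner[team] ?? []).slice().sort();
    const shardCount = Math.ceil(files.length / shardSize);
    for (let index = 0; index < shardCount; index++) {
      shards.push({
        team,
        shard: `${index + 1}/${shardCount}`, // e.g. "2/3" for the second of three
        files: files.slice(index * shardSize, (index + 1) * shardSize),
      });
    }
  }
  return shards;
}
```

Keeping shards inside a single team's file set is what makes the result team-aware: no shard mixes files from two owners, at the cost of some shards being smaller than shardSize.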
Usage Examples
Simple Deterministic Sharding
import { getShardForFilename, distributeFilesAcrossShards, analyzeShardScaling } from 'codemodctl/sharding';
// Get shard for a file - always deterministic
const shard = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
// Same input always gives same output
const shard1 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
const shard2 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
console.log(shard1 === shard2); // always true
// Different shard counts give different results (expected behavior)
const shard5 = getShardForFilename('src/components/Button.tsx', { shardCount: 5 });
const shard10 = getShardForFilename('src/components/Button.tsx', { shardCount: 10 });
// shard5 and shard10 will likely be different, but each is consistent
// Distribute files with consistent hashing for stable scaling
const files = ['file1.ts', 'file2.ts', 'file3.ts'];
const distribution = distributeFilesAcrossShards(files, 5);
// When you need more capacity, most files stay in place
const moreFiles = [...files, 'newFile.ts'];
const analysis = analyzeShardScaling(moreFiles, 5, 6);
// Only ~20-40% of files get reassigned, not all of them!
Key Benefits
- No complex parameters: Just filename and shard count
- Perfectly deterministic: Same input = same output, always
- Stable scaling: When adding shards, most files stay in their original shards
- Minimal reassignment: Only ~20-40% of files move when scaling up
- Fast and simple: Hash-based assignment with consistent ring placement
- Works across runs: A file gets the same shard on every run, regardless of other changes to the filesystem
CLI Commands
codeowner
Analyze CODEOWNERS file and generate sharding configuration.
codemodctl codeowner [options]
Options:
  -s, --shard-size <size>   Number of files per shard (required)
  -p, --state-prop <prop>   Property name for state output (required)
  -c, --codeowners <path>   Path to CODEOWNERS file (optional)
  -r, --rule <path>         Path to AST-grep rule file (required)
Environment variables:
  STATE_OUTPUTS: Path to write state output file
License
MIT
