@getcodesentinel/codesentinel
v1.17.4
Structural and evolutionary risk analysis engine for modern TypeScript/JavaScript codebases.
CodeSentinel is a structural and evolutionary risk analysis engine for modern TypeScript/JavaScript codebases. It turns architecture, change history, and dependency health into a unified risk model that helps engineering teams spot fragility before it becomes failure.
This repository contains the CodeSentinel monorepo, with structural, evolution, external dependency, and deterministic risk analysis engines exposed through a CLI.
Quick Start
1) Install
Global install (good for local/manual usage):
npm install -g @getcodesentinel/codesentinel
Project/CI install (recommended for CI and reproducibility):
npm install --save-dev @getcodesentinel/codesentinel
2) Analyze a project
Run commands from the root of the repository you want to analyze. If you run from another directory, pass the target path explicitly (for example codesentinel analyze /path/to/project).
codesentinel analyze
3) Generate a report
codesentinel report --format md --output codesentinel-report.md
4) Run in CI
Using local project install (no global install required):
npx codesentinel ci --baseline-ref auto --fail-on error
Or in package scripts:
{
  "scripts": {
    "risk:ci": "codesentinel ci --baseline-ref auto --fail-on error"
  }
}
CI example:
- uses: actions/checkout@v4
  with:
    fetch-depth: 0
    filter: blob:none
    ref: ${{ github.event.pull_request.head.sha || github.sha }}
- name: Ensure git history for CodeSentinel
  run: |
    set -euo pipefail
    git fetch --prune --unshallow || true
    BASE_REF="${GITHUB_BASE_REF:-main}"
    git fetch origin "+refs/heads/${BASE_REF}:refs/remotes/origin/${BASE_REF}"
- name: Run CodeSentinel
  run: npx codesentinel ci --baseline-ref auto --max-risk-score 55 --max-risk-delta 0.03 --min-health-score 65 --max-health-delta 0.03 --no-new-cycles --no-new-high-risk-deps --max-new-hotspots 2 --fail-on error
--baseline-ref auto requires enough git history to resolve a baseline deterministically. In GitHub Actions, use fetch-depth: 0 and ensure the CI base branch ref is fetched.
A full workflow template is available at examples/github-actions/codesentinel-ci.yml.
Vision
CodeSentinel combines three signals into a single, explainable risk profile:
- Structural risk: dependency graph topology, cycles, coupling, fan-in/fan-out, boundary violations.
- Evolutionary risk: change frequency, hotspots, bus factor, volatility.
- External risk: transitive dependency exposure, maintainer risk, staleness and abandonment indicators.
  - Includes bounded popularity dampening (weekly npm downloads) as a secondary stability signal.
The CLI output now includes a deterministic risk block composed from those dimensions:
- riskScore and normalizedScore
- ranked hotspots
- fragileClusters (structural cycles + change coupling components)
- dependencyAmplificationZones
- file/module/dependency score tables
It also includes a deterministic health block (healthScore, dimension scores, and actionable top issues) computed independently from risk.
The goal is a practical, engineering-grade model that supports both strategic architecture decisions and daily code review workflows.
Monorepo Layout
- packages/core: shared domain types and cross-cutting services.
- packages/code-graph: source graph analysis primitives.
- packages/git-analyzer: Git history and evolutionary signals.
- packages/dependency-firewall: external dependency and supply chain signals.
- packages/risk-engine: risk aggregation and scoring model.
- packages/health-engine: health posture aggregation and scoring model.
- packages/reporter: structured report output (console, JSON, CI).
- packages/governance: CI gate evaluation and enforcement policy checks.
- packages/cli: user-facing CLI entrypoint.
Each package is standalone, ESM-only, TypeScript-first, and built with tsup. The CLI depends on core; domain packages are kept decoupled to avoid circular dependencies.
Requirements
- Node.js 22+
- pnpm
Commands
- pnpm install
- pnpm build
- pnpm dev
- pnpm test
- pnpm release
CLI
Install globally with npm:
npm install -g @getcodesentinel/codesentinel
Then run:
codesentinel analyze [path]
codesentinel run [path]
codesentinel explain [path]
codesentinel report [path]
codesentinel check [path]
codesentinel ci [path]
codesentinel dependency-risk <dependency[@version]>
Examples:
codesentinel run
codesentinel run . --detail full --format text
codesentinel analyze
codesentinel analyze .
codesentinel analyze ../project
codesentinel explain
codesentinel explain . --top 5 --format text
codesentinel explain . --file src/app/page.tsx
codesentinel explain . --module src/components
codesentinel report
codesentinel report --format md --output report.md
codesentinel report --snapshot snapshot.json
codesentinel report --compare baseline.json --format text
codesentinel check --compare baseline.json --max-risk-delta 0.03 --no-new-cycles
codesentinel ci --baseline baseline.json --snapshot current.json --report report.md --fail-on error
codesentinel ci --baseline-ref origin/main --max-risk-delta 0.03 --no-new-cycles
codesentinel ci --baseline-ref auto --fail-on error
codesentinel dependency-risk react
codesentinel dependency-risk react@<version>
Author identity mode:
# Default: heuristic merge of likely same person across emails
codesentinel analyze . --author-identity likely_merge
# Deterministic: strict email identity, no heuristic merging
codesentinel analyze . --author-identity strict_email
# Personal-project profile (down-weights single-maintainer ownership penalties)
codesentinel analyze . --scoring-profile personal
# Tune recency window (days) used for evolution volatility
codesentinel analyze . --recent-window-days 60
# Quiet mode (only JSON output)
codesentinel analyze . --log-level silent
# Verbose diagnostics to stderr
codesentinel analyze . --log-level debug
# Default compact output (summary)
codesentinel analyze .
# Full output (all sections and detailed arrays)
codesentinel analyze . --output json
codesentinel analyze . --json
# Explain top hotspots with narrative output
codesentinel explain .
# Explain a specific file
codesentinel explain . --file src/app/page.tsx
# Explain a specific module
codesentinel explain . --module src/components
# Explain in markdown or json
codesentinel explain . --format md
codesentinel explain . --format json
codesentinel explain . --recent-window-days 60
# Report generation (human + machine readable)
codesentinel report .
codesentinel report . --format md --output report.md
codesentinel report . --format json
codesentinel report . --snapshot snapshot.json
codesentinel report . --compare baseline.json --format text
Notes:
- likely_merge (default) may merge multiple emails that likely belong to the same person based on repository history.
- strict_email treats each canonical email as a distinct author, which avoids false merges but can split the same person across multiple emails.
- Git mailmap is enabled (git log --use-mailmap). Put .mailmap in the repository being analyzed (the codesentinel analyze [path] target). Git will then deterministically unify known aliases before CodeSentinel computes authorDistribution.
- authorDistribution reflects whichever identity mode is selected.
- Logs are emitted to stderr and JSON output is written to stdout, so CI redirection still works.
- You can set a default log level with CODESENTINEL_LOG_LEVEL (silent|error|warn|info|debug).
- At info/debug, structural, evolution, and dependency stages report progress so long analyses are observable.
- --output summary (default) prints a compact result for terminal use.
- --output json (or --json) prints the full analysis object.
- --recent-window-days <days> customizes the git recency window used to compute recentVolatility (default: 30).
- --scoring-profile default|personal selects the scoring profile.
  - default: balanced team-oriented defaults.
  - personal: lowers single-maintainer ownership penalties for both risk and health ownership scoring.
  - personal does not remove structural, churn, volatility, external, or interaction risk; scores can still be elevated when those signals are high.
When running through pnpm, pass CLI arguments after --:
pnpm dev -- analyze
pnpm dev -- analyze .
pnpm dev -- analyze ../project
pnpm dev -- analyze . --author-identity strict_email
pnpm dev -- run . --format text
pnpm dev -- explain
pnpm dev -- explain . --top 5 --format text
pnpm dev -- explain . --file src/app/page.tsx
pnpm dev -- report
pnpm dev -- report . --format md --output report.md
pnpm dev -- report . --compare baseline.json --format text
pnpm dev -- check . --compare baseline.json --max-risk-delta 0.03 --no-new-cycles
pnpm dev -- ci . --baseline baseline.json --snapshot current.json --report report.md --fail-on warn
Report Output
codesentinel report produces deterministic engineering artifacts from existing analysis outputs.
- default format: md
- formats: text, md, json
- optional file output: --output <path>
- optional snapshot export: --snapshot <path>
- optional diff mode: --compare <baseline.json>
Diff mode compares snapshots and reports:
- repository score deltas
- file/module risk deltas
- new/resolved hotspots
- new/resolved cycles
- dependency exposure list changes
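The new/resolved hotspot items above amount to a set difference between two snapshots. A minimal sketch, assuming hotspots are identified by file path (the snapshot schema is not documented here, and diffHotspots is a hypothetical helper, not part of the CodeSentinel API):

```typescript
// Hypothetical helper: hotspots are keyed by file path, which is an
// assumption about the snapshot format rather than a documented contract.
interface HotspotDiff {
  added: string[];
  resolved: string[];
}

function diffHotspots(baseline: string[], current: string[]): HotspotDiff {
  const before = new Set(baseline);
  const after = new Set(current);
  return {
    // Reported now but absent from the baseline snapshot.
    added: current.filter((file) => !before.has(file)),
    // Reported in the baseline but no longer present.
    resolved: baseline.filter((file) => !after.has(file)),
  };
}

// Invented file lists, for illustration only.
const hotspotDiff = diffHotspots(
  ["src/app/page.tsx", "src/lib/db.ts"],
  ["src/app/page.tsx", "src/components/nav.tsx"],
);
console.log(hotspotDiff);
```

The same pattern would apply to new/resolved cycles, given a stable identifier for each cycle.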
Run Output
codesentinel run is a convenience command that emits analyze + explain + report in one execution.
- formats: text, md, json (md default)
- detail levels: --detail compact|standard|full (compact default, full = full verbose sections)
- explain target selectors: --file <path>, --module <name>, --top <n>
- report diff/snapshot flags: --compare <baseline.json>, --snapshot <path>, --no-trace
CI Mode
codesentinel check evaluates enforcement gates against current analysis (and optional baseline diff).
Supported gates:
- --max-risk-delta <value>
- --max-health-delta <value>
- --no-new-cycles
- --no-new-high-risk-deps
- --max-new-hotspots <count>
- --max-risk-score <score>
- --min-health-score <score>
- --new-hotspot-score-threshold <score>
- --fail-on error|warn
codesentinel ci orchestrates snapshot + diff + gate evaluation + markdown summary generation.
Baseline input modes:
- --baseline <path>: use an existing snapshot JSON artifact.
- --baseline-ref <git-ref>: resolve baseline from git (origin/main, main, HEAD~1) using an isolated temporary worktree. No checkout/stash of your current working tree.
- --baseline-ref auto: deterministic resolver with ordered fallbacks:
  - --baseline-sha <sha> if provided.
  - CI base branch env vars (GITHUB_BASE_REF, CI_MERGE_REQUEST_TARGET_BRANCH_NAME, BITBUCKET_PR_DESTINATION_BRANCH) with origin/<branch> first, then <branch>.
  - branch-aware defaults:
    - on main/master: HEAD~1
    - otherwise: merge-base(HEAD, origin/main) then origin/master, main, master
- --main-branch <name> (repeatable) or --main-branches "main,master,trunk" customize default branch candidates used by --baseline-ref auto.
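The ordered fallbacks for --baseline-ref auto can be pictured as building a candidate list and trying each entry in turn. The sketch below is an illustration of that ordering only, not CodeSentinel's resolver: baselineCandidates and BaselineContext are hypothetical names, and it assumes that "merge-base(HEAD, origin/main) then origin/master, main, master" means a merge-base against each ref in that order.

```typescript
// Hypothetical inputs for illustration; not part of the CodeSentinel API.
interface BaselineContext {
  baselineSha?: string;
  env: Record<string, string | undefined>;
  currentBranch: string;
  mainBranches?: string[];
}

const CI_BASE_VARS = [
  "GITHUB_BASE_REF",
  "CI_MERGE_REQUEST_TARGET_BRANCH_NAME",
  "BITBUCKET_PR_DESTINATION_BRANCH",
];

function baselineCandidates(ctx: BaselineContext): string[] {
  const candidates: string[] = [];

  // 1. An explicit SHA is tried first.
  if (ctx.baselineSha) candidates.push(ctx.baselineSha);

  // 2. CI base branch variables: origin/<branch> first, then <branch>.
  for (const name of CI_BASE_VARS) {
    const branch = ctx.env[name];
    if (branch) candidates.push(`origin/${branch}`, branch);
  }

  // 3. Branch-aware defaults.
  const mains = ctx.mainBranches ?? ["main", "master"];
  if (mains.indexOf(ctx.currentBranch) !== -1) {
    candidates.push("HEAD~1");
  } else {
    // Assumption: remote refs first, then local ones, each via merge-base.
    for (const ref of [...mains.map((b) => `origin/${b}`), ...mains]) {
      candidates.push(`merge-base(HEAD, ${ref})`);
    }
  }

  // Drop duplicates while keeping the first occurrence of each candidate.
  return Array.from(new Set(candidates));
}

console.log(
  baselineCandidates({ env: { GITHUB_BASE_REF: "develop" }, currentBranch: "feature/x" }),
);
```

A caller would still need to check that each candidate resolves in the local clone, which is why full history and a fetched base branch matter in CI.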
GitHub Actions recommendation for deterministic CI with --baseline-ref auto:
- uses: actions/checkout@v4
  with:
    fetch-depth: 0
    filter: blob:none
    ref: ${{ github.event.pull_request.head.sha || github.sha }}
- name: Ensure git history for CodeSentinel
  run: |
    set -euo pipefail
    git fetch --prune --unshallow || true
    BASE_REF="${GITHUB_BASE_REF:-main}"
    git fetch origin "+refs/heads/${BASE_REF}:refs/remotes/origin/${BASE_REF}"
Exit codes:
- 0: no failing violations
- 1: error-level violations
- 2: warn-level violations when --fail-on=warn
- 3: invalid configuration
- 4: internal execution error
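To make the interaction between violation severity and --fail-on concrete, here is a small sketch of how codes 0, 1, and 2 could be derived. It is an interpretation of the list above, not the CLI's source: exitCodeFor is a hypothetical name, and it assumes error-level violations take precedence when both severities are present. Codes 3 and 4 describe configuration and runtime failures and are outside this function.

```typescript
type Severity = "error" | "warn";
type FailOn = "error" | "warn";

// Hypothetical helper mapping gate violations to the documented exit codes.
function exitCodeFor(violations: Severity[], failOn: FailOn): number {
  // Error-level violations always fail the run.
  if (violations.indexOf("error") !== -1) return 1;
  // Warn-level violations fail only when the threshold is lowered to warn.
  if (failOn === "warn" && violations.indexOf("warn") !== -1) return 2;
  return 0;
}

console.log(exitCodeFor(["warn"], "error"));
console.log(exitCodeFor(["warn"], "warn"));
```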
Explain Output
codesentinel explain uses the same risk-engine scoring model as analyze and adds structured explanation traces.
Text/markdown output includes:
- repository score and risk band (low|moderate|elevated|high|very_high)
- repository dimension scores (structural, evolution, external, interactions) as 0-100
- plain-language primary drivers
- concrete evidence values behind those drivers
- intersected signals (composite interaction terms)
- prioritized reduction actions
- per-target breakdowns (repository/file/module/dependency, depending on selection)
Filters:
- --file <path>: explain one file target.
- --module <name>: explain one module target.
- --top <n>: explain top n hotspot files (default behavior when no file/module is provided).
- --format text|json|md: render narrative text, full JSON payload, or markdown.
Understanding Analyze Output
codesentinel analyze returns one JSON document with five top-level blocks:
- structural: file dependency graph shape and graph metrics.
- evolution: git-derived change behavior per file and coupling pairs.
- external: dependency exposure for direct packages plus propagated transitive signals.
- risk: deterministic composition of structural + evolution + external.
- health: deterministic code health posture from local structural/evolution/test signals.
Minimal shape:
{
  "structural": { "...": "..." },
  "evolution": { "...": "..." },
  "external": { "...": "..." },
  "risk": {
    "riskScore": 0,
    "normalizedScore": 0,
    "hotspots": [],
    "fragileClusters": [],
    "dependencyAmplificationZones": []
  },
  "health": {
    "healthScore": 0,
    "normalizedScore": 0,
    "dimensions": {
      "modularity": 0,
      "changeHygiene": 0,
      "testHealth": 0,
      "ownershipDistribution": 0
    },
    "topIssues": [],
    "trace": {
      "schemaVersion": "1",
      "dimensions": []
    }
  }
}
How to read risk first:
- riskScore: overall repository fragility index (0..100).
- hotspots: ranked files to inspect first.
- fragileClusters: groups of files with structural-cycle or co-change fragility.
- dependencyAmplificationZones: files where external dependency pressure intersects with local fragility.
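As a sketch of reading the risk block from the JSON that analyze writes to stdout, the snippet below types only the fields listed above and leaves element shapes as unknown, since they are not specified in this document. summarizeRisk is a hypothetical helper and the sample values are invented.

```typescript
// Only the fields named in the minimal shape are typed; array element
// shapes are left as unknown because this README does not specify them.
interface RiskBlock {
  riskScore: number;
  hotspots: unknown[];
  fragileClusters: unknown[];
  dependencyAmplificationZones: unknown[];
}

interface AnalyzeOutput {
  risk: RiskBlock;
}

// Hypothetical helper: turns the analyze JSON into a one-line summary.
function summarizeRisk(stdout: string): string {
  const { risk } = JSON.parse(stdout) as AnalyzeOutput;
  return [
    `riskScore=${risk.riskScore}`,
    `hotspots=${risk.hotspots.length}`,
    `fragileClusters=${risk.fragileClusters.length}`,
    `dependencyAmplificationZones=${risk.dependencyAmplificationZones.length}`,
  ].join(" ");
}

// Invented sample payload standing in for real CLI output.
const sampleOutput = JSON.stringify({
  risk: {
    riskScore: 42,
    hotspots: [{}],
    fragileClusters: [],
    dependencyAmplificationZones: [],
  },
});
console.log(summarizeRisk(sampleOutput));
```

Because logs go to stderr, piping stdout from codesentinel analyze . --json into a helper like this should see only the JSON document.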
Score direction:
- risk.riskScore: higher means higher risk (worse).
- health.healthScore: higher means better health posture.
- Report views also include derived tiers: riskTier and healthTier.
- health.trace: per-dimension factor traces with normalized metrics and evidence.
Health dimensions:
- modularity: cycle density + fan/centrality concentration + structural-hotspot overlap.
- changeHygiene: churn/volatility concentration + dense co-change clusters.
- testHealth: test presence + test-to-source ratio + testing directory presence.
- ownershipDistribution: top-author share + author entropy + single-author dominance signals.
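Two of the ownershipDistribution inputs, top-author share and author entropy, are standard quantities that can be computed from per-author commit counts. The sketch below uses the usual Shannon entropy normalized by the number of authors; how CodeSentinel weights or normalizes these signals is not stated here, so ownershipMetrics and its output fields are hypothetical.

```typescript
interface OwnershipMetrics {
  topAuthorShare: number; // fraction of commits by the most active author
  normalizedEntropy: number; // 0 = one author dominates, 1 = evenly spread
}

// Hypothetical helper: commitsByAuthor maps an author id to a commit count.
function ownershipMetrics(commitsByAuthor: Record<string, number>): OwnershipMetrics {
  const counts = Object.keys(commitsByAuthor)
    .map((author) => commitsByAuthor[author])
    .filter((n) => n > 0);
  const total = counts.reduce((sum, n) => sum + n, 0);
  if (total === 0) return { topAuthorShare: 0, normalizedEntropy: 0 };

  const shares = counts.map((n) => n / total);
  const topAuthorShare = Math.max(...shares);

  // Shannon entropy in bits: -sum(p * log2(p)).
  const entropy = -shares.reduce((sum, p) => sum + p * Math.log2(p), 0);
  // Divide by the maximum possible entropy for this many authors.
  const normalizedEntropy = shares.length > 1 ? entropy / Math.log2(shares.length) : 0;

  return { topAuthorShare, normalizedEntropy };
}

console.log(ownershipMetrics({ alice: 3, bob: 1 }));
```

A high top-author share together with low entropy is the pattern the single-author dominance signals are meant to surface.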
Signal ingestion (deterministic, local):
- Structural and evolution metrics come from local graph + git analysis.
- Test posture uses file/path heuristics only (no mandatory coverage integration).
- Ownership posture is derived from local git author distribution metrics.
Interpretation notes:
- Scores are deterministic for the same inputs and config.
- Scores are meant for within-repo prioritization and trend tracking.
- Full model details and limits are in packages/risk-engine/README.md.
Risk Score Guide
Use these ranges as operational guidance:
- 0-20: low fragility (architectural and change pressure signals are generally contained).
- 20-40: moderate fragility (localized hotspots exist; monitor trend direction and concentration).
- 40-60: elevated fragility (prioritize top hotspots before introducing major concurrent change).
- 60-80: high fragility (expect higher coordination cost, regressions, and change coupling across areas).
- 80-100: very high fragility (treat as immediate triage; focus on stabilization before further expansion).
These ranges are heuristics for triage, not incident probability.
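For scripts that want to act on these ranges, a band lookup could look like the sketch below. Two assumptions are baked in: the band names are borrowed from the explain output's risk band list, and a score that sits exactly on a boundary (20, 40, 60, 80) is placed in the higher band, since the guide does not say which side the boundaries belong to. riskBand is a hypothetical helper.

```typescript
type RiskBand = "low" | "moderate" | "elevated" | "high" | "very_high";

// Hypothetical helper: maps a 0..100 risk score to a triage band.
function riskBand(riskScore: number): RiskBand {
  // The negated range check also rejects NaN.
  if (!(riskScore >= 0 && riskScore <= 100)) {
    throw new RangeError(`riskScore must be within 0..100, got ${riskScore}`);
  }
  if (riskScore < 20) return "low";
  if (riskScore < 40) return "moderate";
  if (riskScore < 60) return "elevated";
  if (riskScore < 80) return "high";
  return "very_high";
}

console.log(riskBand(55));
```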
Health Score Guide
Use these ranges as operational guidance:
- 0-20: critical health posture (maintainability pressure is highly concentrated and debt is likely compounding).
- 20-40: weak health posture (key maintainability bottlenecks are visible; prioritize stabilization work).
- 40-60: fair health posture (baseline is workable, but concentrated architecture/change pressure can still slow delivery).
- 60-80: good health posture (most maintainability signals are stable, with targeted improvements still valuable).
- 80-100: excellent health posture (maintainability pressure is broadly distributed and sustainably controlled over time).
These ranges are heuristics for prioritization, not absolute quality guarantees.
What Moves Scores
risk.riskScore and risk.fileScores[*].score increase when:
- structurally central files/modules change frequently,
- ownership is highly concentrated in volatile files,
- files in central areas are exposed to high external dependency pressure,
- tightly coupled change patterns emerge.
They decrease when:
- change concentrates less around central files,
- ownership spreads or volatility decreases,
- dependency pressure decreases (shallower trees, fewer high-risk signals),
- hotspot concentration drops.
External Risk Signal Semantics
For external.dependencies, each direct dependency now exposes three signal fields:
- ownRiskSignals: signals computed from that package itself.
- inheritedRiskSignals: signals propagated from transitive dependencies in its subtree.
- riskSignals: union of ownRiskSignals and inheritedRiskSignals.
Data source notes:
- Lockfile-first extraction supports pnpm-lock.yaml, package-lock.json/npm-shrinkwrap.json, yarn.lock, and bun.lock.
- If no lockfile is present, CodeSentinel attempts a bounded npm registry graph resolution from direct dependencies.
- npm weekly download metadata is fetched only for direct dependencies (not all transitive nodes).
Classification lists:
- highRiskDependencies: production direct packages classified from strong own signals (not inherited-only signals).
- highRiskDevelopmentDependencies: same classification model for direct development dependencies.
- transitiveExposureDependencies: direct packages carrying inherited transitive exposure signals.
Current high-risk rule for direct dependencies:
- mark high-risk if own signals include abandoned, or
- mark high-risk if at least two of own signals are in {high_centrality, deep_chain, high_fanout}, or
- mark high-risk if own signals include single_maintainer and the package is stale (>= half abandoned threshold) or has no recent repository activity signal.
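The three conditions above can be read as a single predicate over a dependency's own signals. The following is a sketch of that reading, not the classifier itself: isHighRisk, isStale, and hasRecentRepoActivity are hypothetical names, with isStale standing in for "age is at least half the abandoned threshold".

```typescript
type RiskSignal =
  | "abandoned"
  | "single_maintainer"
  | "high_centrality"
  | "deep_chain"
  | "high_fanout"
  | "metadata_unavailable";

// Hypothetical shape for illustration; only ownRiskSignals is a documented field.
interface DirectDependency {
  ownRiskSignals: RiskSignal[];
  isStale: boolean; // assumed: age >= half the abandoned threshold
  hasRecentRepoActivity: boolean; // assumed: a recent-activity signal exists
}

const GRAPH_SIGNALS: RiskSignal[] = ["high_centrality", "deep_chain", "high_fanout"];

function isHighRisk(dep: DirectDependency): boolean {
  const has = (signal: RiskSignal): boolean => dep.ownRiskSignals.indexOf(signal) !== -1;

  // Rule 1: abandoned on its own is enough.
  if (has("abandoned")) return true;
  // Rule 2: at least two of the three graph-shape signals.
  if (GRAPH_SIGNALS.filter(has).length >= 2) return true;
  // Rule 3: single maintainer plus staleness or missing recent activity.
  return has("single_maintainer") && (dep.isStale || !dep.hasRecentRepoActivity);
}

console.log(
  isHighRisk({ ownRiskSignals: ["single_maintainer"], isStale: false, hasRecentRepoActivity: true }),
);
```

Only own signals are inspected, which matches the note that inherited-only signals do not make a direct dependency high-risk.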
Propagation policy is explicit and deterministic:
- single_maintainer: not propagated
  - Rationale: maintainer concentration is package-specific governance, not a transferable property.
- abandoned: propagated
  - Rationale: depending on abandoned transitive packages is still real operational exposure.
  - Note: abandonedDependencies list only includes packages with own abandoned.
- high_centrality: propagated
  - Rationale: highly central transitive packages can become systemic weak points for a parent dependency.
- deep_chain: propagated
  - Rationale: deep transitive trees increase update/debug complexity for top-level dependencies.
- high_fanout: propagated
  - Rationale: broad transitive fan-out increases blast radius and maintenance surface.
- metadata_unavailable: not propagated
  - Rationale: unknown metadata for one child should not automatically degrade parent classification.
This keeps package-level facts local while still surfacing meaningful transitive exposure.
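As an illustration of the policy, the sketch below derives inheritedRiskSignals and riskSignals for one direct dependency from the own signals of the packages in its subtree. The field names come from this section; mergeSignals and the flat list-of-lists input are hypothetical simplifications of whatever graph traversal the tool actually performs.

```typescript
type RiskSignal =
  | "abandoned"
  | "single_maintainer"
  | "high_centrality"
  | "deep_chain"
  | "high_fanout"
  | "metadata_unavailable";

// Signals the policy marks as propagated from transitive packages.
const PROPAGATED: RiskSignal[] = ["abandoned", "high_centrality", "deep_chain", "high_fanout"];

interface SignalFields {
  ownRiskSignals: RiskSignal[];
  inheritedRiskSignals: RiskSignal[];
  riskSignals: RiskSignal[];
}

// Hypothetical helper: subtreeOwnSignals holds the own signals of each
// transitive package below the direct dependency.
function mergeSignals(own: RiskSignal[], subtreeOwnSignals: RiskSignal[][]): SignalFields {
  const inherited: RiskSignal[] = [];
  for (const childSignals of subtreeOwnSignals) {
    for (const signal of childSignals) {
      const propagates = PROPAGATED.indexOf(signal) !== -1;
      if (propagates && inherited.indexOf(signal) === -1) inherited.push(signal);
    }
  }

  // riskSignals is the union of own and inherited signals, without duplicates.
  const union = own.slice();
  for (const signal of inherited) {
    if (union.indexOf(signal) === -1) union.push(signal);
  }

  return { ownRiskSignals: own, inheritedRiskSignals: inherited, riskSignals: union };
}

console.log(
  mergeSignals(
    ["single_maintainer"],
    [["abandoned", "single_maintainer"], ["metadata_unavailable", "deep_chain"]],
  ),
);
```

In this example the child's single_maintainer and metadata_unavailable signals stay local, while abandoned and deep_chain surface on the parent as inherited exposure.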
ESM Import Policy
- The workspace uses TypeScript with moduleResolution: "NodeNext" and ESM output.
- For local relative imports, use .js specifiers in source files (example: import { x } from "./x.js").
- Do not use .ts specifiers for runtime imports in package source files.
- This keeps emitted code and runtime resolution aligned with Node.js ESM behavior.
License
MIT
