arch-score
v0.1.3
A language-agnostic CLI that scores how well any project follows modern system-design standards, recommends the best folder structure, and emits guidance files that make AI coding assistants follow good system design.
arch-score grades its own repo and publishes the result as a score badge generated by its own GitHub Action; see CI & badge.
arch-score is a heuristic advisor, not a judge. It gives you a score (0–100), explains why points were lost with file-level references, and hands you prioritized, concrete fixes. It works on frontend or backend code in any language, runs fully offline, and has zero paid dependencies.
Example report — a backend service (TypeScript · express, deep tier, 41 modules):
🟡 Overall 72 / 100 · grade C
| | Category | Score | Weight |
|:--:|:--|--:|--:|
| 🟢 | Architecture & Layering | 100 | 19 |
| 🟢 | Modularity & Coupling ◆ | 80 | 11 |
| 🟢 | Folder Structure | 88 | 15 |
| 🔴 | Testing Architecture | 45 | 11 |
| 🟡 | Containerization | 70 | 6 |
| ⚪ | …other categories | | |
Legend 🟢 ≥ 80 (healthy) · 🟡 60–79 (needs work) · 🔴 < 60 (at risk) · ◆ deep-tier (import-graph) analysis
Weights are normalized across the categories that apply to your project — a backend includes Containerization (shown above); a CLI or library re-weights it out, and the remaining categories total 100 on their own. In your terminal these scores and bars are rendered in live ANSI color; the table above is the color-coded equivalent for npm/GitHub.
Install
# Run without installing
npx arch-score .
# Or install globally
npm install -g arch-score
archscore .
Requires Node.js ≥ 18.
Quick start
archscore . # pretty terminal report
archscore ./service --ci # exit non-zero if below threshold (CI gate)
archscore . --json > report.json # machine-readable
archscore . --html # writes archscore-report.html
archscore . --emit-md # writes SYSTEM_DESIGN.md playbook
archscore . --emit-skill --format claude # writes CLAUDE.md for AI assistants
How it works: tiered analysis
arch-score auto-detects languages, frameworks, and project type from manifests
(package.json, pyproject.toml, go.mod, pom.xml, Cargo.toml, composer.json, Gemfile, …) and file extensions, then runs two tiers:
- Universal tier — works for every language. Folder/architecture pattern detection, config-as-env vs hardcoded, test presence & ratio, a CI-runs-tests check, docs, containerization (Dockerfile/compose) for services, observability config, lockfile & dependency-pinning checks, secret-leakage heuristics, and file/module size outliers.
- Deep tier — optional per-language plugins that build a real import/dependency graph for circular-dependency, fan-in/fan-out, and graph-depth analysis. Ships with adapters for JavaScript/TypeScript, Python, and Go. For unsupported languages it degrades gracefully to universal-tier scoring and tells you which tier ran.
The report header always states the tier used, and any category that can't be fairly assessed is re-weighted out (its weight is redistributed) rather than scored zero — so an unsupported language is never silently penalized.
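The deep-tier graph checks can be illustrated with a small sketch. It assumes the import graph is a plain adjacency record (module name to the modules it imports); the `ImportGraph` type and the function names `findCycle` and `fanCounts` are hypothetical and not taken from arch-score's source.

```typescript
// Hypothetical shape (an assumption): module path -> modules it imports.
type ImportGraph = Record<string, string[]>;

/** Returns one cycle as a path that ends where it starts, or null if none is found. */
function findCycle(graph: ImportGraph): string[] | null {
  const state: Record<string, "visiting" | "done" | undefined> = {};
  const stack: string[] = [];

  function visit(node: string): string[] | null {
    state[node] = "visiting";
    stack.push(node);
    for (const dep of graph[node] || []) {
      if (state[dep] === "visiting") {
        // Back edge: the cycle is the current path from the first occurrence of dep.
        return stack.slice(stack.indexOf(dep)).concat(dep);
      }
      if (state[dep] === undefined) {
        const found = visit(dep);
        if (found) return found;
      }
    }
    stack.pop();
    state[node] = "done";
    return null;
  }

  for (const node of Object.keys(graph)) {
    if (state[node] === undefined) {
      const found = visit(node);
      if (found) return found;
    }
  }
  return null;
}

/** Counts outgoing (fan-out) and incoming (fan-in) edges per module. */
function fanCounts(graph: ImportGraph): Record<string, { fanIn: number; fanOut: number }> {
  const counts: Record<string, { fanIn: number; fanOut: number }> = {};
  const entry = (name: string) => (counts[name] = counts[name] || { fanIn: 0, fanOut: 0 });
  for (const node of Object.keys(graph)) {
    entry(node).fanOut = graph[node].length;
    for (const dep of graph[node]) entry(dep).fanIn += 1;
  }
  return counts;
}

// Made-up example graph for illustration only.
const example: ImportGraph = {
  "api/routes": ["services/orders"],
  "services/orders": ["db/client", "services/billing"],
  "services/billing": ["services/orders"], // closes a cycle
  "db/client": [],
};

console.log(findCycle(example)); // ["services/orders", "services/billing", "services/orders"]
console.log(fanCounts(example)["services/orders"]); // { fanIn: 2, fanOut: 2 }
```

How the real adapters turn cycles and fan-in/fan-out outliers into point deductions is not shown here; the sketch only covers the graph traversal.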
The rubric
Each category is scored 0–100 against a transparent rubric (start at 100, lose points per finding), then combined into a weighted overall score. The default profile is structure-first. Weights are relative: arch-score normalizes them across the categories that apply to your project, so the effective weights always total 100 for a given project.
| Category | Weight | Tier | What it rewards |
| --- | ---: | --- | --- |
| Architecture & Layering | 20 | Universal | A recognizable pattern, thin entry points, no god-folders |
| Folder Structure | 16 | Universal | Layout matches a convention for the detected project type; sane depth |
| Modularity & Coupling | 12 | Deep* | No circular deps, no fan-out/fan-in outliers, shallow graph |
| Testing Architecture | 12 | Universal | Test presence, healthy test-to-source ratio, integration layer, CI-runs-tests |
| Config & 12-Factor | 10 | Universal | Env-based config, .env.example, no committed .env, no hardcoded endpoints |
| Error Handling & Resilience | 8 | Universal | No swallowed errors, a central handler, timeouts/retries for services |
| Security Hygiene | 8 | Universal | No leaked secrets, .gitignore covers env, lockfile committed |
| Observability | 7 | Universal | Structured logging, metrics/tracing deps, health endpoints |
| Documentation | 7 | Universal | A substantial README, architecture docs, contributing guide |
| Containerization | 6 | Universal** | A Dockerfile/compose file, with a HEALTHCHECK |
* Modularity uses the Deep tier when an adapter supports the language; otherwise it falls back to a coarse module-size cohesion proxy and says so.
** Containerization applies to services only (backend, monorepo). For
CLIs, libraries, frontends, and mobile apps it's re-weighted out — they're never
penalized for not having a Dockerfile.
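As a worked illustration of the normalization, the sketch below rescales the relative weights from the table above. The function name `normalizeWeights` and the rounding step are assumptions; only the weights themselves come from this README. The raw weights sum to 106 with Containerization and to exactly 100 without it, which is consistent with the rounded weights in the example report (19, 15, 11, 11, 6).

```typescript
// Default relative weights, copied from the rubric table above.
const DEFAULT_WEIGHTS: Record<string, number> = {
  "Architecture & Layering": 20,
  "Folder Structure": 16,
  "Modularity & Coupling": 12,
  "Testing Architecture": 12,
  "Config & 12-Factor": 10,
  "Error Handling & Resilience": 8,
  "Security Hygiene": 8,
  "Observability": 7,
  "Documentation": 7,
  "Containerization": 6,
};

/** Rescales the weights of the applicable categories so they sum to 100. */
function normalizeWeights(
  weights: Record<string, number>,
  applicable: string[],
): Record<string, number> {
  const total = applicable.reduce((sum, name) => sum + weights[name], 0);
  const normalized: Record<string, number> = {};
  for (const name of applicable) normalized[name] = (weights[name] / total) * 100;
  return normalized;
}

const allCategories = Object.keys(DEFAULT_WEIGHTS);
// Backend: every category applies, raw total is 106.
const backendWeights = normalizeWeights(DEFAULT_WEIGHTS, allCategories);
// CLI or library: Containerization is re-weighted out, raw total is 100.
const cliWeights = normalizeWeights(
  DEFAULT_WEIGHTS,
  allCategories.filter((name) => name !== "Containerization"),
);

console.log(Math.round(backendWeights["Architecture & Layering"])); // 19
console.log(Math.round(backendWeights["Containerization"])); // 6
console.log(Math.round(cliWeights["Architecture & Layering"])); // 20
```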
Rubric details
Architecture & Layering
- −30: no recognizable pattern (layered / hexagonal / feature / MVC) or code is flat
- −15: entry points and business logic live at the same shallow level
- −15: a single "god-folder" holds >60% of source files
Modularity & Coupling
Deep tier:
- up to −35: circular dependency cycles
- up to −20: modules with very high fan-out
- up to −15: god-modules with very high fan-in
- −10: dependency chains deeper than 8
Universal fallback:
- up to −25: oversized modules (>400 lines) as a low-cohesion proxy
Folder Structure
- −30: essentially flat (no meaningful directories)
- −18: layout doesn't match any recognized convention
- −8: layout doesn't match the best convention for the project type
- up to −12: missing recommended directories for the project type
- −8: excessive nesting (>8 levels)
Testing Architecture
- −55: no tests at all
- −30 / −15: very low / modest test-to-source ratio (banded)
- −10: no integration/e2e layer in a non-trivial codebase
- −8: no CI configuration running tests
Config & 12-Factor
- −25: a concrete .env file committed
- up to −20: hardcoded hosts/IPs/URLs in source
- −8: reads env vars but has no .env.example
- −8: no evidence of environment-based config
Error Handling & Resilience
- up to −30: empty/swallowed catch blocks
- −12: no timeout/retry/circuit-breaker signals (services)
- −10: no centralized error-handling boundary
Observability
- −18: no structured logging (bare prints) in a service
- −12: no metrics/tracing instrumentation
- −10: no health/readiness endpoint
Security Hygiene
- up to −45: likely hardcoded secrets/credentials
- −12: .env files exist but aren't git-ignored
- −8: manifest present but no lockfile committed
- −6: no .gitignore
Documentation
- −40: no README
- −18: thin README (few words / headings)
- −10: no architecture/design docs
- −5: no CONTRIBUTING guide (larger projects)
Containerization
Applies to backend and monorepo projects; re-weighted out (never penalized)
for CLIs, libraries, frontends, and mobile apps.
- −30: no Dockerfile or docker-compose file for a service
- −12: a Dockerfile is present but defines no HEALTHCHECK
Grades: A ≥ 90, B ≥ 80, C ≥ 70, D ≥ 60, E ≥ 50, else F.
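The scoring rules above (start at 100, lose points per finding, combine by weight, map to a grade) can be sketched as follows. The `Finding` shape, the function names, the clamping to 0–100, and the rounding of the overall score are assumptions made for illustration; only the penalties and grade bands come from this README.

```typescript
// Hypothetical finding shape (an assumption): a positive penalty plus a message.
interface Finding {
  penalty: number;
  message: string;
}

/** Starts at 100, subtracts each finding's penalty, and clamps to 0..100. */
function scoreCategory(findings: Finding[]): number {
  const lost = findings.reduce((sum, f) => sum + f.penalty, 0);
  return Math.max(0, Math.min(100, 100 - lost));
}

/** Weighted average of category scores; weights may be relative (not summing to 100). */
function overallScore(categories: { score: number; weight: number }[]): number {
  const totalWeight = categories.reduce((sum, c) => sum + c.weight, 0);
  const weighted = categories.reduce((sum, c) => sum + c.score * c.weight, 0);
  return Math.round(weighted / totalWeight);
}

/** Grade bands as listed above: A >= 90, B >= 80, C >= 70, D >= 60, E >= 50, else F. */
function gradeFor(score: number): string {
  if (score >= 90) return "A";
  if (score >= 80) return "B";
  if (score >= 70) return "C";
  if (score >= 60) return "D";
  if (score >= 50) return "E";
  return "F";
}

const testing = scoreCategory([{ penalty: 55, message: "no tests at all" }]);
const overall = overallScore([
  { score: 100, weight: 20 }, // Architecture & Layering, no findings
  { score: testing, weight: 12 }, // Testing Architecture
]);

console.log(testing); // 45
console.log(overall, gradeFor(overall)); // 79 C
```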
Folder-structure advisor
arch-score classifies your current structure (layered, hexagonal/clean,
feature/domain, mvc, or flat) and recommends the best structure for the
detected project type, with a concrete proposed tree and rationale:
| Project type | Recommended | Shape |
| --- | --- | --- |
| Backend API/service | Hexagonal / Clean | domain/ application/ infrastructure/ interfaces/http/ config/ |
| Frontend SPA | Feature-based | app/ features/<feature>/{components,hooks,api,state}/ shared/ |
| CLI | Layered | thin bin/ → commands/ → pure core/ → adapters/ |
| Library | Public surface | src/index + hidden internal/ |
| Mobile | Feature-modular | features/ navigation/ design-system/ |
| Monorepo | Workspaces | apps/ + packages/, each following its own type |
Run with --emit-md to get the full proposed tree and gap diff.
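A gap diff of this kind can be approximated in a few lines. The sketch below is illustrative only: the `RECOMMENDED` map is transcribed from the table above, while the project-type keys other than "backend" and the function `missingDirectories` are hypothetical names, not arch-score's API.

```typescript
// Recommended directories per project type, transcribed from the table above.
// Keys other than "backend" are assumed names for illustration.
const RECOMMENDED: Record<string, string[]> = {
  backend: ["domain", "application", "infrastructure", "interfaces/http", "config"],
  frontend: ["app", "features", "shared"],
  cli: ["bin", "commands", "core", "adapters"],
  mobile: ["features", "navigation", "design-system"],
  monorepo: ["apps", "packages"],
};

/** Lists recommended directories that are absent from the existing layout. */
function missingDirectories(projectType: string, existing: string[]): string[] {
  const recommended = RECOMMENDED[projectType] || [];
  return recommended.filter((dir) => existing.indexOf(dir) === -1);
}

console.log(missingDirectories("backend", ["domain", "config", "utils"]));
// ["application", "infrastructure", "interfaces/http"]
```

An unknown project type yields an empty list here, which is a simplification chosen for the sketch.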
Guidance file generation
This is the differentiator: encode your project's conventions + the recommended architecture as actionable rules for AI coding assistants.
archscore . --emit-md # SYSTEM_DESIGN.md — human playbook
archscore . --emit-skill --format agents # AGENTS.md
archscore . --emit-skill --format claude # CLAUDE.md
archscore . --emit-skill --format cursor # .cursorrules
archscore . --emit-skill --format copilot # .github/copilot-instructions.md
See examples/ for real generated output.
Output modes & CI
archscore . --ci --threshold 80 # exit code 1 if overall < 80
archscore . --json # full JSON report (graph summarized)
archscore . --html=report.html # self-contained HTML report
--ci makes it a quality gate you can drop into any pipeline.
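The gate itself reduces to a threshold comparison. The sketch below assumes a report exposes `overall` and `grade` (the two fields used in the Programmatic API example); the `ScoreSummary` interface and `ciGate` function are hypothetical names, and the message format is invented for illustration.

```typescript
// Hypothetical minimal view of a report (an assumption based on report.overall / report.grade).
interface ScoreSummary {
  overall: number;
  grade: string;
}

/** Returns exit code 1 when the overall score is below the threshold, otherwise 0. */
function ciGate(report: ScoreSummary, threshold: number): { exitCode: number; message: string } {
  const failed = report.overall < threshold;
  const verdict = failed ? "below" : "meets";
  return {
    exitCode: failed ? 1 : 0,
    message: `score ${report.overall} (${report.grade}) ${verdict} threshold ${threshold}`,
  };
}

console.log(ciGate({ overall: 72, grade: "C" }, 80).exitCode); // 1
console.log(ciGate({ overall: 80, grade: "B" }, 80).exitCode); // 0
```

A threshold of 0 can never fail in this sketch, because scores are never negative.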
CI & badge
Add the arch-score GitHub Action to any repo to score it on every push/PR, optionally gate the build, post a score comment on PRs, and publish a live score badge.
Workflow (copy-paste — ~30 seconds)
# .github/workflows/arch-score.yml
name: arch-score
on:
  push: { branches: [main] }
  pull_request:
permissions:
  contents: write        # update the badge branch
  pull-requests: write   # post the PR comment
jobs:
  arch-score:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: lakshaymeghlan/arch-score@v1
        with:
          threshold: "0"   # set > 0 to fail the build below that score
Action inputs: path (.), threshold (0 = never fail), comment (true), badge (true), badge-branch (arch-score-badge), version (latest). Outputs: score, grade, tier.
On a pull request it posts/updates a single sticky comment with the score table and top fixes; on a push to your default branch it commits a fresh badge to the arch-score-badge branch.
Badge
After the Action has run once, add the badge to your README (self-hosted SVG — no third-party service):
[](https://github.com/lakshaymeghlan/arch-score)
Prefer the shields.io look? The Action also writes arch-score-badge.json (a shields endpoint) to the same branch.
The badge auto-updates every push. You can also generate badge files locally — fully offline — with archscore . --emit-badge --emit-badge-json.
The core tool stays 100% offline and zero-paid-dependency. The Action uses only GitHub's own first-party actions (setup-node, github-script); nothing is added to the npm package.
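For orientation, a shields endpoint file is a small JSON object with `schemaVersion`, `label`, `message`, and an optional `color`. The sketch below builds one from a score; the label text, the message format, and the color names chosen for each band are assumptions (the band thresholds follow the legend near the top of this README), and `toShieldsEndpoint` is a hypothetical function, not the Action's actual code.

```typescript
// Shape of a shields.io endpoint response (schemaVersion must be 1).
interface ShieldsEndpoint {
  schemaVersion: 1;
  label: string;
  message: string;
  color: string;
}

/** Maps a score to a badge color using the legend bands: >= 80, 60 to 79, < 60. */
function badgeColor(score: number): string {
  if (score >= 80) return "brightgreen";
  if (score >= 60) return "yellow";
  return "red";
}

/** Builds the endpoint payload; label and message format are illustrative choices. */
function toShieldsEndpoint(score: number, grade: string): ShieldsEndpoint {
  return {
    schemaVersion: 1,
    label: "arch-score",
    message: `${score} (${grade})`,
    color: badgeColor(score),
  };
}

console.log(JSON.stringify(toShieldsEndpoint(72, "C")));
// {"schemaVersion":1,"label":"arch-score","message":"72 (C)","color":"yellow"}
```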
Optional AI deep-review (--deep-ai)
The core tool is 100% offline and uses no AI. The optional --deep-ai flag
adds a qualitative architecture review on top — and it stays free:
- Local & free: install Ollama, ollama pull llama3.1, then archscore . --deep-ai. Runs fully offline at zero cost.
- Your own API key: export ANTHROPIC_API_KEY=... and arch-score will use Anthropic instead. Your key is used directly and never bundled with the package.
If neither is available, --deep-ai prints a friendly note and the rest of the
report is unaffected. Only metrics and findings are sent — never your source code.
ARCHSCORE_AI_PROVIDER=ollama ARCHSCORE_AI_MODEL=llama3.1 archscore . --deep-ai
ANTHROPIC_API_KEY=sk-... ARCHSCORE_AI_MODEL=claude-sonnet-4-6 archscore . --deep-ai
Configuration
Drop an archscore.config.js (or .mjs / .json) in your project root. See
archscore.config.example.js:
export default {
  weights: { architecture: 25, testing: 15 }, // re-normalized automatically
  ignore: ["generated", "third_party"],
  threshold: 75,
  // projectType: "backend", // force instead of auto-detecting
};
Programmatic API
import { analyzeProject, renderTerminal, generateSkill } from "arch-score";
const report = analyzeProject("./my-project");
console.log(report.overall, report.grade);
console.log(renderTerminal(report));
const claudeMd = generateSkill(report, "claude");
Architecture (it eats its own dog food)
src/
  core/        orchestrator, types, scoring engine, scanner, constants
  detect/      language / framework / project-type detection
  analyzers/   universal-tier checks, one module per category (Analyzer interface)
  adapters/    deep-tier language plugins: js-ts, python, go (LangAdapter interface)
  advisor/     folder-structure classification + recommendation
  reporters/   terminal | json | html
  generators/  SYSTEM_DESIGN.md, AI skill files, score badge (SVG/JSON), PR comment
  ai/          optional --deep-ai (user's own key or local Ollama)
  cli/         argument parsing + run loop
bin/           archscore executable
Analyzers and adapters are pluggable behind common interfaces, so new language analyzers and new checks drop in without touching the core.
Adding a language adapter
Implement LangAdapter (build an adjacency map of module → imports, hand it to
analyzeGraph), then register it in src/adapters/index.ts. That's it — the
Deep tier picks it up for matching projects automatically.
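To give a feel for the adjacency map an adapter produces, here is a rough sketch. The real LangAdapter interface is not shown in this README, so the `SketchAdapter` shape, the `buildAdjacency` helper, and the regex-based import extraction are all assumptions; a production adapter would more likely use a proper parser and resolve import specifiers to module paths.

```typescript
// Hypothetical adapter shape (an assumption, not arch-score's LangAdapter).
interface SketchAdapter {
  extensions: string[];
  extractImports(source: string): string[];
}

// Naive JS/TS import extraction: matches `from "x"` and `require("x")` only.
// Side-effect imports such as `import "x"` are not matched by this pattern.
const jsLikeAdapter: SketchAdapter = {
  extensions: [".js", ".ts"],
  extractImports(source) {
    const found: string[] = [];
    const pattern = /(?:from\s+|require\(\s*)["']([^"']+)["']/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) found.push(match[1]);
    return found;
  },
};

/** Builds module -> imports for every file whose extension the adapter handles. */
function buildAdjacency(
  files: Record<string, string>,
  adapter: SketchAdapter,
): Record<string, string[]> {
  const graph: Record<string, string[]> = {};
  for (const path of Object.keys(files)) {
    const handled = adapter.extensions.some((ext) => path.slice(-ext.length) === ext);
    if (handled) graph[path] = adapter.extractImports(files[path]);
  }
  return graph;
}

// Made-up file contents for illustration only.
const sampleFiles: Record<string, string> = {
  "src/a.ts": 'import { b } from "./b";\nconst c = require("./c");',
  "src/b.ts": "export const b = 1;",
  "README.md": "not source code",
};

console.log(buildAdjacency(sampleFiles, jsLikeAdapter));
// { "src/a.ts": ["./b", "./c"], "src/b.ts": [] }
```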
Development
npm install
npm run build
npm test # 59 unit + e2e tests
npm run selfscan # run arch-score on itself