@mlinaresweb/code-context-diff
v0.1.0
Multilanguage code context extractor from Git diffs for AI documentation, code review and automation workflows.
@mlinaresweb/code-context
Multilanguage code context extraction from Git diffs for AI documentation, code review, automated changelogs, pull request assistants and developer tooling.
@mlinaresweb/code-context takes a Git diff and builds a structured, AI-ready context pack by reading the changed files and discovering the surrounding code that matters: complete functions, classes, interfaces, imports, exports, dependency injection usage, framework references, WordPress hooks, AJAX handlers, REST routes, templates, CSS selectors, DOM ids, and cross-file usage.
The goal is simple: when an AI receives a diff, it should also receive the real code context needed to understand the change without sending the entire repository.
Table of contents
- Why this library exists
- What it can do
- Supported languages and ecosystems
- Installation
- Requirements
- Quick start
- Core concepts
- Getting a Git diff
- Main API
- Presets
- Documentation trust and strict mode
- WordPress support
- Debug reports and diagnostics
- Writing report files
- CLI/report script usage
- Configuration reference
- Output structure
- Examples
- Recommended workflows
- Performance and large repositories
- Troubleshooting
- Limitations
- Roadmap
- Contributing
- License and attribution
Why this library exists
Large Language Models can summarize code changes, write documentation and help review pull requests, but a raw diff is often not enough.
A diff may only show this:
- return this.userService.updateProfile(id, body);
+ return this.userService.updateProfile(id, body, context);
But to understand it properly, the AI may need:
- the complete controller method,
- the service method,
- the DTO,
- the repository,
- dependency injection tokens,
- imports and path aliases,
- related framework decorators,
- usage references,
- tests or documentation constraints if available,
- and, in WordPress projects, templates, AJAX handlers, hooks, partials and frontend assets.
This library builds that context automatically.
It is especially useful when building tools like:
- AI documentation generators,
- AI PR reviewers,
- commit documentation assistants,
- technical changelog generators,
- codebase-aware agents,
- automated onboarding explainers,
- internal engineering documentation pipelines,
- CI quality gates for documentation.
What it can do
Diff-based changed file detection
The library parses standard Git diffs and detects changed files and changed line ranges.
It can also use stagedFiles as a fallback when the diff does not contain enough file metadata.
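To make the idea concrete, here is a minimal sketch of how changed files and line ranges can be read from a unified diff. It is an illustration only and not the library's parser: `parseChangedFiles` and the `ChangedFile` shape are hypothetical names, and the ranges it returns are hunk windows on the new side of the file, which is coarser than per-line tracking.

```typescript
// Illustrative sketch only: NOT the library's actual diff parser.
// `ChangedFile` and `parseChangedFiles` are hypothetical names.
interface ChangedFile {
  readonly path: string;
  readonly ranges: Array<{ start: number; end: number }>;
}

function parseChangedFiles(diffText: string): ChangedFile[] {
  const files: ChangedFile[] = [];
  let current: ChangedFile | undefined;

  for (const line of diffText.split('\n')) {
    // "+++ b/<path>" names the new version of a file.
    const fileMatch = /^\+\+\+ b\/(.+)$/.exec(line);
    if (fileMatch) {
      current = { path: fileMatch[1] ?? '', ranges: [] };
      files.push(current);
      continue;
    }

    // "@@ -a,b +c,d @@" gives the start line (c) and length (d) on the new side.
    const hunkMatch = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/.exec(line);
    if (hunkMatch && current) {
      const start = Number(hunkMatch[1]);
      // A missing length means the hunk covers exactly one line.
      const length = hunkMatch[2] === undefined ? 1 : Number(hunkMatch[2]);
      if (length > 0) {
        current.ranges.push({ start, end: start + length - 1 });
      }
    }
  }

  return files;
}
```

A diff with no `+++ b/...` header yields an empty array here, which is the kind of situation where a list such as stagedFiles is useful as a fallback.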
Complete symbol extraction
It extracts complete code regions around changed lines instead of sending isolated fragments.
Examples of extracted symbols:
- classes,
- methods,
- functions,
- interfaces,
- traits,
- components,
- controllers,
- services,
- hooks,
- callbacks,
- route handlers,
- template windows,
- fallback file windows when exact symbols cannot be resolved.
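The last item, fallback file windows, can be pictured as a fixed radius of lines around each changed line, with overlapping windows merged. The sketch below is an assumption about how such a window could be computed, not the library's implementation; `buildFallbackWindows` and `LineWindow` are hypothetical names, and the `radius` parameter plays the role that the `radiusLines` option is described as having.

```typescript
// Illustrative sketch only: hypothetical helper, not part of the library API.
interface LineWindow {
  start: number;
  end: number;
}

function buildFallbackWindows(
  changedLines: number[],
  radius: number,
  totalLines: number,
): LineWindow[] {
  const sorted = [...changedLines].sort((a, b) => a - b);
  const windows: LineWindow[] = [];

  for (const line of sorted) {
    // Clamp the window to the file boundaries (lines are 1-based).
    const start = Math.max(1, line - radius);
    const end = Math.min(totalLines, line + radius);
    const last = windows[windows.length - 1];

    if (last && start <= last.end + 1) {
      // Overlapping or adjacent windows are merged into one region.
      last.end = Math.max(last.end, end);
    } else {
      windows.push({ start, end });
    }
  }

  return windows;
}
```

Merging keeps nearby edits in one contiguous region instead of several overlapping fragments.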
Cross-file context
The library can follow project relationships such as:
- imports,
- exports,
- local module references,
- usage references,
- property references,
- dependency injection references,
- framework references,
- WordPress template references,
- WordPress AJAX actions,
- WordPress hooks,
- CSS and DOM relationships.
AI-ready Markdown
It renders the extracted context into Markdown that can be passed directly to an AI prompt.
Structured JSON pack
It also returns a structured CodeContextPack object that can be inspected, stored, analyzed or transformed.
Quality and trust scoring
The report includes:
- quality score,
- context trust status,
- unresolved references,
- documentation readiness,
- diagnostics,
- warnings,
- recommendations.
Strict documentation mode
You can make the analysis fail if the context is not safe enough for documentation.
This is useful in CI or before publishing generated documentation.
Supported languages and ecosystems
The library is designed to be multilanguage and framework-aware.
Current focus:
| Ecosystem | Supported context |
|---|---|
| TypeScript | classes, methods, interfaces, imports, exports, DI, usage |
| JavaScript | functions, classes, imports, frontend usage, globals |
| TSX / JSX | React components, hooks, imports, usage |
| Node.js | services, controllers, dependency graph |
| NestJS | decorators, controllers, services, DTOs, DI context |
| tsyringe | container resolution and dependency injection references |
| Next.js | TSX/React context, route/app related code patterns |
| Nuxt / Vue | SFC parsing, script blocks, provide/inject references |
| PHP | classes, traits, interfaces, functions, procedural helpers |
| WordPress | hooks, filters, AJAX, REST, shortcodes, templates, ACF, CSS/DOM |
| Python | functions, classes, FastAPI-like dependency context |
| CSS / SCSS | selectors, related DOM classes, style context |
| SQL | fallback context for changed SQL files |
The design is intentionally extensible. More language-specific extractors can be added over time.
Installation
npm install @mlinaresweb/code-context
For local development inside this repository:
npm install
npm run check
npm run test
npm run build
Requirements
- Node.js >=20
- ESM project support
- A Git diff string as input
- Access to the local repository files being analyzed
The library reads files from the local filesystem. It does not need a remote backend.
Quick start
import { buildCodeContextReport } from '@mlinaresweb/code-context';
const diff = `
diff --git a/src/user.service.ts b/src/user.service.ts
index 1111111..2222222 100644
--- a/src/user.service.ts
+++ b/src/user.service.ts
@@ -1,6 +1,6 @@
 export class UserService {
   public getName(): string {
-    return 'Ada';
+    return 'Ada Lovelace';
   }
 }
`;
const report = await buildCodeContextReport({
  repositoryRoot: process.cwd(),
  diff,
  stagedFiles: ['src/user.service.ts'],
  config: {
    preset: 'fullstack',
  },
  documentationContext: {
    mode: 'warn',
    minimumTrustStatus: 'safe',
  },
});
console.log(report.markdown);
console.log(report.quality.contextTrustStatus);
console.log(report.documentation.passed);
Core concepts
CodeContextPack
The raw structured output.
It contains changed files, changed symbols, related symbols, references, diagnostics and warnings.
CodeContextReport
A higher-level result produced by buildCodeContextReport.
It includes:
- pack
- markdown
- debugMarkdown
- summary
- quality
- documentation
- warnings
Changed symbols
Symbols directly affected by the diff.
Examples:
- changed function,
- changed method,
- changed class,
- changed template window,
- changed component.
Related symbols
Symbols not directly changed but required to understand the change.
Examples:
- imported service,
- DTO used by a controller,
- repository used by a service,
- AJAX handler called by frontend JS,
- CSS file defining classes used by a changed template,
- WordPress partial included by get_template_part.
Diagnostics
Debug-level or quality-level facts generated during analysis.
Examples:
- WordPress index built,
- WordPress index cache hit,
- related file included,
- reference resolved,
- reference unresolved.
Documentation readiness
A strict or warning-based quality gate that tells you whether the context is safe enough to generate final documentation.
Getting a Git diff
The library expects a diff string. You can obtain it in different ways.
Staged changes
git diff --staged
Node example:
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';
const execFileAsync = promisify(execFile);
const { stdout: diff } = await execFileAsync('git', [
  '-C',
  repositoryRoot,
  'diff',
  '--staged',
]);
Unstaged changes
git diff
Compare two commits
git diff HEAD~1..HEAD
Compare two branches
git diff main..feature/my-branch
Get changed file names
git diff --name-only --staged
You can pass those file names as stagedFiles:
const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles,
});
GitHub Pull Request diff
With GitHub CLI:
gh pr diff 123 > pr.diff
Then read the file and pass it to the library:
import { readFile } from 'node:fs/promises';
const diff = await readFile('pr.diff', 'utf8');
const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
});
GitHub Actions example
name: Code Context
on:
  pull_request:
jobs:
  context:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - name: Generate diff
        run: git diff origin/${{ github.base_ref }}...HEAD > pr.diff
      - name: Generate code context report
        run: node scripts/generate-code-context.js
Main API
buildCodeContextReport
Recommended for most users.
import { buildCodeContextReport } from '@mlinaresweb/code-context';
const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles,
  config: {
    preset: ['fullstack', 'documentation'],
  },
  debugOptions: {
    includePromptMarkdown: true,
    includeDiagnostics: true,
  },
  documentationContext: {
    mode: 'warn',
    minimumTrustStatus: 'safe',
  },
});
Returns:
{
  pack,
  markdown,
  debugMarkdown,
  summary,
  quality,
  documentation,
  warnings
}
buildCodeContextPack
Lower-level API.
Use this if you want the raw context pack and plan to render or process it yourself.
import { buildCodeContextPack } from '@mlinaresweb/code-context';
const pack = await buildCodeContextPack({
  repositoryRoot,
  diff,
  stagedFiles,
  config: {
    preset: 'balanced',
  },
});
renderCodeContextForPrompt
Render a CodeContextPack to Markdown.
import {
buildCodeContextPack,
renderCodeContextForPrompt,
} from '@mlinaresweb/code-context';
const pack = await buildCodeContextPack({
  repositoryRoot,
  diff,
});
const markdown = renderCodeContextForPrompt({
  pack,
});
writeCodeContextReportFiles
Writes ready-to-inspect files.
import { writeCodeContextReportFiles } from '@mlinaresweb/code-context';
const result = await writeCodeContextReportFiles({
  repositoryRoot,
  diff,
  stagedFiles,
  outputDirectory: '.code-context-report',
  config: {
    preset: 'wordpress-large',
  },
});
Generated files:
- code-context-report.md
- code-context-debug.md
- code-context-pack.json
- code-context-summary.json
Presets
Presets are the easiest way to configure the library.
config: {
  preset: 'wordpress-large',
}
You can combine presets:
config: {
  preset: ['fullstack', 'wordpress-large', 'documentation'],
}
Manual config always wins:
config: {
  preset: 'wordpress-large',
  maxCharacters: 900_000,
}
Available presets
| Preset | Purpose |
|---|---|
| balanced | General-purpose default behavior |
| documentation | More context for AI documentation |
| fullstack | Broad frontend and backend context across JS/TS, PHP and Python |
| wordpress | WordPress projects with moderate size |
| wordpress-large | Large WordPress themes/plugins and custom platforms |
| ci-fast | Faster analysis for CI pipelines |
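The precedence described above (defaults first, then presets, then manual options) can be sketched as a layered merge. Everything in this block is hypothetical: the numeric values are invented for illustration, the real preset contents are defined by the library, and the rule that a later preset overrides an earlier one when several are combined is an assumption, not something this README states.

```typescript
// Illustrative sketch only: values and merge order between presets are assumptions.
type ContextConfig = Record<string, number | boolean>;

// Hypothetical defaults and preset contents, invented for this example.
const exampleDefaults: ContextConfig = {
  maxCharacters: 100_000,
  maxRelatedSymbols: 50,
};

const examplePresets: Record<string, ContextConfig> = {
  documentation: { maxRelatedSymbols: 120 },
  'wordpress-large': { maxCharacters: 600_000, maxRelatedSymbols: 300 },
};

function resolveExampleConfig(
  presetNames: string | string[],
  userConfig: ContextConfig,
): ContextConfig {
  const names = Array.isArray(presetNames) ? presetNames : [presetNames];
  let resolved: ContextConfig = { ...exampleDefaults };

  for (const presetName of names) {
    // Assumption: later presets override earlier ones; unknown names are ignored.
    resolved = { ...resolved, ...(examplePresets[presetName] ?? {}) };
  }

  // Manual options are applied last, so they always win.
  return { ...resolved, ...userConfig };
}
```

With these invented values, passing `'wordpress-large'` together with `{ maxCharacters: 900_000 }` keeps the preset's other settings and replaces only the character budget.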
Documentation trust and strict mode
The report includes a trust status:
report.quality.contextTrustStatus
Possible values:
| Status | Meaning |
|---|---|
| safe | Context is good enough for final documentation |
| partial | Context is useful but should be reviewed |
| unsafe | Important context is missing |
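A `minimumTrustStatus` check implies an ordering between these three values. The sketch below shows one way to express that comparison, assuming the order unsafe < partial < safe suggested by the table; `meetsMinimumTrust` is a hypothetical helper, not an export of the library.

```typescript
// Illustrative sketch only: hypothetical helper, not part of the library API.
type TrustStatus = 'unsafe' | 'partial' | 'safe';

// Assumed ordering, based on the meanings in the table: unsafe < partial < safe.
const trustRank: Record<TrustStatus, number> = {
  unsafe: 0,
  partial: 1,
  safe: 2,
};

function meetsMinimumTrust(actual: TrustStatus, minimum: TrustStatus): boolean {
  return trustRank[actual] >= trustRank[minimum];
}
```

Under this ordering a `partial` report satisfies a minimum of `partial` but not a minimum of `safe`.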
Warn mode
Default recommended mode for interactive documentation tools.
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'safe',
}
The function does not throw. You can inspect:
report.documentation.passed
report.documentation.blockingIssues
Strict mode
Recommended for CI or publishing gates.
documentationContext: {
  mode: 'strict',
  minimumTrustStatus: 'safe',
}
If the context does not meet the required trust level, the library throws:
CodeContextUnsafeDocumentationContextError
Example:
import {
CodeContextUnsafeDocumentationContextError,
buildCodeContextReport,
} from '@mlinaresweb/code-context';
try {
  const report = await buildCodeContextReport({
    repositoryRoot,
    diff,
    documentationContext: {
      mode: 'strict',
      minimumTrustStatus: 'safe',
    },
  });
} catch (error) {
  if (error instanceof CodeContextUnsafeDocumentationContextError) {
    console.error(error.readiness.message);
    console.error(error.quality.contextTrustStatus);
  }
  throw error;
}
Permissive mode
Never blocks, even if context is unsafe.
documentationContext: {
  mode: 'permissive',
  minimumTrustStatus: 'safe',
}
WordPress support
WordPress support is one of the strongest parts of the library.
It is designed for real-world projects where functionality is spread across:
- theme templates,
- plugin bootstrap files,
- functions.php,
- procedural helpers,
- classes,
- AJAX handlers,
- REST controllers,
- shortcodes,
- template parts,
- frontend JS,
- inline scripts,
- CSS/SCSS,
- ACF fields,
- custom post types,
- taxonomies,
- page builders and integrations.
WordPress Project Index
The library builds an automatic WordPress index for the current theme/plugin root.
It indexes:
- PHP functions,
- PHP classes,
- interfaces,
- traits,
- hooks,
- filters,
- AJAX actions,
- REST routes,
- shortcodes,
- template references,
- enqueue handles,
- asset paths,
- custom post types,
- taxonomies,
- ACF fields,
- blocks,
- JS functions,
- window globals,
- JS AJAX actions,
- CSS classes,
- DOM ids,
- file basenames,
- path tokens.
This is automatic. You do not need to hardcode project prefixes.
AJAX example
Changed template:
<script>
fetch('/wp-admin/admin-ajax.php?action=choose_delivery')
</script>
The library tries to find:
add_action('wp_ajax_choose_delivery', '...');
add_action('wp_ajax_nopriv_choose_delivery', '...');
and includes the handler context.
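The link between the frontend call and the PHP hooks follows the WordPress naming convention: an action named `choose_delivery` is handled by `wp_ajax_choose_delivery` and, for logged-out visitors, `wp_ajax_nopriv_choose_delivery`. The sketch below derives those hook names from a source string. It is a simplified illustration with hypothetical helper names, and it only recognises actions passed in a query string, which is an assumption made to keep the example short.

```typescript
// Illustrative sketch only: hypothetical helpers, not the library's resolver.
function findAjaxActions(source: string): string[] {
  const actions = new Set<string>();
  // Matches "?action=<name>" or "&action=<name>" inside URLs.
  const pattern = /[?&]action=([A-Za-z0-9_-]+)/g;
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(source)) !== null) {
    actions.add(match[1] ?? '');
  }

  return Array.from(actions);
}

function expectedAjaxHooks(action: string): string[] {
  // WordPress convention: one hook for logged-in users, one for anonymous visitors.
  return [`wp_ajax_${action}`, `wp_ajax_nopriv_${action}`];
}
```

A resolver built on this idea would then search the indexed PHP files for `add_action` calls that register either hook name.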
Template part example
Changed template:
get_template_part('template-parts/car-form/booking-summary');
The library tries to include:
template-parts/car-form/booking-summary.php
CSS and DOM relationship example
Changed template:
<div id="checkout-summary" class="booking-summary-wrapper">
The library can include related CSS/JS files that reference:
.booking-summary-wrapper {}
#checkout-summary {}
or JS:
document.getElementById('checkout-summary')
WordPress coverage
The library reports unresolved important references.
Examples:
- AJAX action used but no handler found,
- template referenced but file not found,
- PHP custom function called but not found,
- JS global used but not found,
- asset referenced but not found.
This helps prevent AI hallucination.
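As a rough picture of what a coverage check involves, the sketch below collects the ids and classes used in a template and splits them into those found in a stylesheet and those that are not. It is deliberately naive (a plain substring search, double-quoted attributes only) and uses hypothetical names; it is not how the library computes coverage.

```typescript
// Illustrative sketch only: hypothetical helper with a deliberately naive matcher.
interface SelectorCoverage {
  resolved: string[];
  unresolved: string[];
}

function checkSelectorCoverage(templateHtml: string, stylesheet: string): SelectorCoverage {
  const selectors: string[] = [];
  // Only double-quoted id="..." and class="..." attributes are recognised.
  const attributePattern = /(?:^|\s)(id|class)="([^"]*)"/g;
  let match: RegExpExecArray | null;

  while ((match = attributePattern.exec(templateHtml)) !== null) {
    const prefix = match[1] === 'id' ? '#' : '.';
    const tokens = (match[2] ?? '').split(/\s+/).filter(Boolean);
    for (const token of tokens) {
      selectors.push(prefix + token);
    }
  }

  const coverage: SelectorCoverage = { resolved: [], unresolved: [] };
  for (const selector of selectors) {
    // Naive check: a substring match, so ".card" would also match ".card-title".
    if (stylesheet.includes(selector)) {
      coverage.resolved.push(selector);
    } else {
      coverage.unresolved.push(selector);
    }
  }

  return coverage;
}
```

Reporting the `unresolved` list explicitly, rather than silently dropping it, is what lets a downstream prompt state that a reference could not be found instead of guessing.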
Debug reports and diagnostics
The debug report explains what happened internally.
console.log(report.debugMarkdown);
It can include:
- summary,
- quality metrics,
- warnings,
- changed files,
- changed symbols,
- related symbols,
- references,
- diagnostics,
- WordPress Project Index debug,
- resolved references,
- unresolved references,
- LLM-ready Markdown.
Useful diagnostics include:
| Diagnostic code | Meaning |
|---|---|
| wordpress-index-summary | WordPress index was built |
| wordpress-index-cache-hit | Reused cached index |
| wordpress-index-cache-miss | Built new index |
| wordpress-related-file-included | Related file included |
| wordpress-reference-resolved | Reference resolved |
| wordpress-reference-unresolved | Reference unresolved |
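A quick way to get an overview of these codes is to count how often each one appears. The helper below is a hypothetical sketch: it assumes only that each diagnostic carries a `code` string, as in the table above, and makes no claim about the other fields of the library's diagnostic type.

```typescript
// Illustrative sketch only: `DiagnosticLike` is a hypothetical minimal shape.
// Only the `code` field is taken from this README.
interface DiagnosticLike {
  readonly code: string;
}

function countDiagnosticsByCode(
  diagnostics: readonly DiagnosticLike[],
): Record<string, number> {
  const counts: Record<string, number> = {};

  for (const diagnostic of diagnostics) {
    counts[diagnostic.code] = (counts[diagnostic.code] ?? 0) + 1;
  }

  return counts;
}
```

Comparing the resolved and unresolved counts gives a first impression of how complete the context is before reading the full debug report.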
Example:
const unresolved = report.pack.diagnostics?.filter((diagnostic) => {
  return diagnostic.code === 'wordpress-reference-unresolved';
});
Writing report files
Use writeCodeContextReportFiles to inspect context manually.
await writeCodeContextReportFiles({
  repositoryRoot,
  diff,
  stagedFiles,
  outputDirectory: './.code-context-report',
  config: {
    preset: ['wordpress-large', 'documentation'],
  },
  debugOptions: {
    includePromptMarkdown: true,
    includeDiagnostics: true,
  },
});
Output:
.code-context-report/
├─ code-context-report.md
├─ code-context-debug.md
├─ code-context-pack.json
└─ code-context-summary.json
CLI/report script usage
If you use the included report script during development, you can generate reports from Git.
Staged changes
npm run report -- --repo "/path/to/project" --staged --out "./code-context-output"
Unstaged changes
npm run report -- --repo "/path/to/project" --unstaged --out "./code-context-output"
Commit range
npm run report -- --repo "/path/to/project" --base HEAD~1 --head HEAD --out "./code-context-output"
WordPress large preset
npm run report -- --repo "/path/to/project" --staged --preset wordpress-large --out "./code-context-output"
Multiple presets
npm run report -- --repo "/path/to/project" --staged --preset fullstack,wordpress-large,documentation --out "./code-context-output"
Strict mode
npm run report -- --repo "/path/to/project" --staged --strict --minimum-trust safe --out "./code-context-output"
Configuration reference
The config object is resolved as:
defaults < preset < user config
Example:
config: {
  preset: 'wordpress-large',
  maxCharacters: 900_000,
}
Common options
| Option | Description |
|---|---|
| enabled | Enable or disable code context |
| preset | Preset or list of presets |
| maxCharacters | Global context character budget |
| maxRelatedSymbols | Maximum related symbols |
| maxExternalFiles | Maximum external files to inspect/include |
| radiusLines | Fallback line radius around changed lines |
| dependencyResolutionEnabled | Follow imports/modules |
| frameworkDetectionEnabled | Detect framework references |
| frameworkSymbolLinkingEnabled | Link framework refs to symbols |
| frameworkDeepLinkingEnabled | Include deeper framework-related context |
| usageAnalysisEnabled | Extract usage references |
| crossFileUsageResolutionEnabled | Resolve usage across files |
| dependencyInjectionDetectionEnabled | Detect DI references |
| crossFileDependencyInjectionResolutionEnabled | Resolve DI across files |
| propertyInferenceEnabled | Extract property references |
| contextPrioritizationEnabled | Prioritize context by relevance |
| qualityValidationEnabled | Add quality validation |
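To show how limits such as `maxCharacters` and `maxRelatedSymbols` can interact with relevance-based prioritization, here is a small greedy selection sketch. The `CandidateSymbol` shape, the numeric `relevance` score and the greedy strategy are all assumptions made for illustration; the library's actual prioritization may work differently.

```typescript
// Illustrative sketch only: hypothetical types and strategy, not the library's algorithm.
interface CandidateSymbol {
  readonly name: string;
  readonly relevance: number; // higher means more relevant (assumed scale)
  readonly text: string;
}

function selectWithinBudget(
  candidates: readonly CandidateSymbol[],
  maxCharacters: number,
  maxRelatedSymbols: number,
): CandidateSymbol[] {
  // Most relevant candidates are considered first.
  const ranked = [...candidates].sort((a, b) => b.relevance - a.relevance);
  const selected: CandidateSymbol[] = [];
  let usedCharacters = 0;

  for (const candidate of ranked) {
    if (selected.length >= maxRelatedSymbols) {
      break;
    }
    if (usedCharacters + candidate.text.length > maxCharacters) {
      // Too large for the remaining budget; a smaller candidate may still fit.
      continue;
    }
    selected.push(candidate);
    usedCharacters += candidate.text.length;
  }

  return selected;
}
```

Raising either limit lets more context through, at the cost of a larger prompt.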
WordPress options
| Option | Description |
|---|---|
| wordpressIndexCacheEnabled | Reuse WordPress index within analysis |
| wordpressIndexMaxFiles | Max files scanned by WordPress index |
Output structure
BuildCodeContextReportResult
interface BuildCodeContextReportResult {
  readonly pack: CodeContextPack;
  readonly markdown: string;
  readonly debugMarkdown: string;
  readonly summary: CodeContextReportSummary;
  readonly quality: CodeContextQualityReport;
  readonly documentation: CodeContextDocumentationReadiness;
  readonly warnings: readonly string[];
}
CodeContextReportSummary
Contains:
- total characters,
- changed file count,
- changed symbol count,
- related symbol count,
- reference counts,
- warning count,
- diagnostic count,
- languages,
- file paths,
- symbols by relevance.
CodeContextQualityReport
Contains:
- status,
- contextTrustStatus,
- score,
- coverage metrics,
- penalties,
- unresolved reference counts,
- isSafeForDocumentation,
- issues,
- recommendations.
Examples
Node / TypeScript service
const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['src/users/user.service.ts'],
  config: {
    preset: 'fullstack',
  },
});
The report may include:
- changed service method,
- imported repository,
- DTO,
- interface,
- dependency injection tokens,
- usage references.
NestJS
const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['apps/api/src/users/users.controller.ts'],
  config: {
    preset: ['fullstack', 'documentation'],
  },
});
WordPress template
const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['wp-content/themes/my-theme/single-car-checkout.php'],
  config: {
    preset: 'wordpress-large',
  },
});
Python
const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['app/api/users.py'],
  config: {
    preset: 'fullstack',
  },
});
React / Next
const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['app/users/UserProfile.tsx'],
  config: {
    preset: 'fullstack',
  },
});
Vue / Nuxt
const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['components/UserCard.vue'],
  config: {
    preset: 'fullstack',
  },
});
Recommended workflows
AI documentation generation
config: {
  preset: ['fullstack', 'documentation'],
},
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'safe',
}
For large WordPress projects:
config: {
  preset: ['wordpress-large', 'documentation'],
},
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'safe',
}
CI gate
documentationContext: {
  mode: 'strict',
  minimumTrustStatus: 'safe',
}
Fast pull request summary
config: {
  preset: 'ci-fast',
},
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'partial',
}
Performance and large repositories
For large repositories:
- use wordpress-large only when needed,
- keep wordpressIndexCacheEnabled: true,
- increase wordpressIndexMaxFiles if the project is very large,
- inspect code-context-debug.md,
- review cache diagnostics,
- increase maxCharacters only when the AI model can handle it.
Example:
config: {
  preset: 'wordpress-large',
  wordpressIndexMaxFiles: 10_000,
  maxCharacters: 900_000,
  maxExternalFiles: 220,
  maxRelatedSymbols: 500,
}
The WordPress index cache is per analysis execution. It does not persist between commands, which avoids stale results.
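A cache scoped to a single analysis run can be modelled as a `Map` held in a closure: it serves repeated lookups during the run and is discarded when the run ends. The sketch below illustrates that idea with hypothetical names (`createAnalysisRun`, `IndexLookup`) and a string array standing in for the index; it is not the library's cache implementation. The two diagnostic codes are the ones listed in the diagnostics table.

```typescript
// Illustrative sketch only: hypothetical names, with string[] standing in for an index.
interface IndexLookup {
  readonly index: string[];
  readonly diagnostic: 'wordpress-index-cache-hit' | 'wordpress-index-cache-miss';
}

function createAnalysisRun(
  buildIndex: (root: string) => string[],
): (root: string) => IndexLookup {
  // The cache lives in this closure, so it is gone once the run is finished.
  const cache = new Map<string, string[]>();

  return function getIndex(root: string): IndexLookup {
    const cached = cache.get(root);
    if (cached !== undefined) {
      return { index: cached, diagnostic: 'wordpress-index-cache-hit' };
    }

    const built = buildIndex(root);
    cache.set(root, built);
    return { index: built, diagnostic: 'wordpress-index-cache-miss' };
  };
}
```

Because each run starts with an empty map, files edited between two commands are always re-indexed, which is the trade-off the paragraph above describes.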
Troubleshooting
The report is unsafe
Check:
report.quality.criticalUnresolvedReferenceCount
report.quality.reportableUnresolvedReferenceCount
report.documentation.blockingIssues
Open code-context-debug.md and look at:
- unresolved WordPress references,
- warnings,
- included files,
- coverage report.
A WordPress AJAX handler was not found
Make sure the project contains:
add_action('wp_ajax_my_action', 'my_callback');
or:
add_action('wp_ajax_nopriv_my_action', 'my_callback');
and that the file is inside the same theme/plugin root or within the configured scan limit.
A template part was not found
For:
get_template_part('template-parts/example');
the library expects:
template-parts/example.php
A context report is too large
Reduce:
maxCharacters
maxRelatedSymbols
maxExternalFiles
or use:
preset: 'ci-fast'
A context report is missing important files
Increase:
maxExternalFiles
maxRelatedSymbols
wordpressIndexMaxFiles
maxCharacters
and inspect diagnostics in debugMarkdown.
Limitations
No context extraction library can guarantee perfect results for every repository.
Known limitations:
- highly dynamic imports may not always resolve,
- runtime-generated WordPress hooks may not be fully detected,
- complex PHP variable-based template paths may need fallback context,
- minified files are poor inputs and are deliberately not a focus,
- very large files may be truncated by configured character budgets,
- framework-specific support grows over time.
The library is designed to prefer truthful, inspectable context over hallucinated certainty.
If something cannot be resolved, it should emit warnings or diagnostics.
Roadmap
Planned improvements:
- persistent optional cache,
- more language-specific extractors,
- deeper Python import resolution,
- deeper Next.js route/app directory awareness,
- deeper Nuxt auto-import awareness,
- improved PHP namespace/class resolution,
- better Composer/autoload understanding,
- better WordPress block/theme.json context,
- CLI package command,
- GitHub Action wrapper,
- HTML report output,
- VS Code extension integration.
Contributing
Contributions are welcome.
Good contribution areas:
- new language extractors,
- better framework support,
- more WordPress fixtures,
- performance improvements,
- diagnostics improvements,
- README/examples,
- bug reproductions with small fixtures.
Recommended workflow:
npm install
npm run check
npm run test
npm run build
Before opening a pull request:
npm run check
npm run test
npm run build
npm run pack:dry
Please include tests for new extraction behavior.
License and attribution
This project is licensed under the MIT License.
That means you can:
- use it commercially,
- modify it,
- fork it,
- include it in your own tools,
- publish improvements,
- distribute copies.
The MIT License requires keeping the copyright and license notice in copies or substantial portions of the software.
Please preserve attribution to:
mlinaresweb
when reusing, forking or publishing derived versions of this library.
