@mlinaresweb/code-context-diff

v0.1.0

Published

20 days ago

Multilanguage code context extractor from Git diffs for AI documentation, code review and automation workflows.


mlinaresweb

ai code-context documentation git-diff wordpress php typescript javascript python react vue nuxt nextjs nestjs ast llm

@mlinaresweb/code-context

Multilanguage code context extraction from Git diffs for AI documentation, code review, automated changelogs, pull request assistants and developer tooling.

@mlinaresweb/code-context takes a Git diff and builds a structured, AI-ready context pack by reading the changed files and discovering the surrounding code that matters: complete functions, classes, interfaces, imports, exports, dependency injection usage, framework references, WordPress hooks, AJAX handlers, REST routes, templates, CSS selectors, DOM ids, and cross-file usage.

The goal is simple: when an AI receives a diff, it should also receive the code context it needs to understand the change, without being sent the entire repository.

Why this library exists

Large Language Models can summarize code changes, write documentation and help review pull requests, but a raw diff is often not enough.

A diff may only show this:

- return this.userService.updateProfile(id, body);
+ return this.userService.updateProfile(id, body, context);

But to understand it properly, the AI may need:

the complete controller method,
the service method,
the DTO,
the repository,
dependency injection tokens,
imports and path aliases,
related framework decorators,
usage references,
tests or documentation constraints if available,
and, in WordPress projects, templates, AJAX handlers, hooks, partials and frontend assets.

This library builds that context automatically.

It is especially useful when building tools like:

AI documentation generators,
AI PR reviewers,
commit documentation assistants,
technical changelog generators,
codebase-aware agents,
automated onboarding explainers,
internal engineering documentation pipelines,
CI quality gates for documentation.

What it can do

Diff-based changed file detection

The library parses standard Git diffs and detects changed files and changed line ranges.

It can also use stagedFiles as a fallback when the diff does not contain enough file metadata.
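
As a rough illustration of this step, the sketch below parses a unified diff into changed files and new-side hunk ranges, and falls back to a caller-supplied file list when the diff names no files. `parseChangedRanges`, `resolveChangedPaths` and the `ChangedFile` shape are hypothetical names for this example, not the library's internals, and a hunk range only approximates the exact changed lines.

```typescript
// Hypothetical shape for this sketch; not the library's internal type.
interface ChangedFile {
  path: string;
  ranges: Array<{ start: number; end: number }>;
}

// Collects post-change file paths and new-side hunk ranges from a unified diff.
// Simplification: an added line whose content begins with "++ " would be
// misread as a file header.
function parseChangedRanges(diff: string): ChangedFile[] {
  const files: ChangedFile[] = [];
  let current: ChangedFile | undefined;

  for (const line of diff.split('\n')) {
    // "+++ b/path" names the post-change file; "/dev/null" marks a deletion.
    const fileMatch = /^\+\+\+ (?:b\/)?(.+)$/.exec(line);
    if (fileMatch) {
      const path = fileMatch[1];
      current = path && path !== '/dev/null' ? { path, ranges: [] } : undefined;
      if (current) files.push(current);
      continue;
    }

    // "@@ -a,b +c,d @@": c is the first new-side line, d the line count (1 if omitted).
    const hunkMatch = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/.exec(line);
    if (hunkMatch && current) {
      const start = Number(hunkMatch[1]);
      const count = hunkMatch[2] === undefined ? 1 : Number(hunkMatch[2]);
      if (count > 0) current.ranges.push({ start, end: start + count - 1 });
    }
  }
  return files;
}

// Uses the caller-supplied list only when the diff itself names no files.
function resolveChangedPaths(diff: string, stagedFiles: string[] = []): string[] {
  const parsed = parseChangedRanges(diff).map((file) => file.path);
  return parsed.length > 0 ? parsed : stagedFiles;
}
```

Tracking only the new side of each hunk is a deliberate choice here: the files on disk reflect the post-change state, so new-side line numbers are the ones that can be mapped back to source.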

Complete symbol extraction

It extracts complete code regions around changed lines instead of sending isolated fragments.

Examples of extracted symbols:

classes,
methods,
functions,
interfaces,
traits,
components,
controllers,
services,
hooks,
callbacks,
route handlers,
template windows,
fallback file windows when exact symbols cannot be resolved.
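
The fallback window mentioned last can be pictured as a fixed radius of lines around each change. The sketch below is one plausible way to compute such windows, under the assumption that overlapping windows are merged; `fallbackWindows` and `LineRange` are hypothetical names, and the library's actual behavior may differ.

```typescript
interface LineRange {
  start: number;
  end: number;
}

// Expands each changed range by `radius` lines on both sides, clamps the result
// to the file bounds (1-based, inclusive), and merges windows that overlap or touch.
function fallbackWindows(changed: LineRange[], radius: number, totalLines: number): LineRange[] {
  const expanded = changed
    .map((range) => ({
      start: Math.max(1, range.start - radius),
      end: Math.min(totalLines, range.end + radius),
    }))
    .sort((a, b) => a.start - b.start);

  const merged: LineRange[] = [];
  for (const candidate of expanded) {
    const last = merged[merged.length - 1];
    if (last && candidate.start <= last.end + 1) {
      // Overlapping or adjacent: grow the previous window instead of adding a new one.
      last.end = Math.max(last.end, candidate.end);
    } else {
      merged.push({ ...candidate });
    }
  }
  return merged;
}
```

Merging keeps the same lines from being emitted twice when two edits sit close together.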

Cross-file context

The library can follow project relationships such as:

imports,
exports,
local module references,
usage references,
property references,
dependency injection references,
framework references,
WordPress template references,
WordPress AJAX actions,
WordPress hooks,
CSS and DOM relationships.

AI-ready Markdown

It renders the extracted context into Markdown that can be passed directly to an AI prompt.

Structured JSON pack

It also returns a structured CodeContextPack object that can be inspected, stored, analyzed or transformed.

Quality and trust scoring

The report includes:

quality score,
context trust status,
unresolved references,
documentation readiness,
diagnostics,
warnings,
recommendations.

Strict documentation mode

You can make the analysis fail if the context is not safe enough for documentation.

This is useful in CI or before publishing generated documentation.

Supported languages and ecosystems

The library is designed to be multilanguage and framework-aware.

Current focus:

| Ecosystem | Supported context |
|---|---|
| TypeScript | classes, methods, interfaces, imports, exports, DI, usage |
| JavaScript | functions, classes, imports, frontend usage, globals |
| TSX / JSX | React components, hooks, imports, usage |
| Node.js | services, controllers, dependency graph |
| NestJS | decorators, controllers, services, DTOs, DI context |
| tsyringe | container resolution and dependency injection references |
| Next.js | TSX/React context, route/app related code patterns |
| Nuxt / Vue | SFC parsing, script blocks, provide/inject references |
| PHP | classes, traits, interfaces, functions, procedural helpers |
| WordPress | hooks, filters, AJAX, REST, shortcodes, templates, ACF, CSS/DOM |
| Python | functions, classes, FastAPI-like dependency context |
| CSS / SCSS | selectors, related DOM classes, style context |
| SQL | fallback context for changed SQL files |

The design is intentionally extensible. More language-specific extractors can be added over time.

Installation

npm install @mlinaresweb/code-context

For local development inside this repository:

npm install
npm run check
npm run test
npm run build

Requirements

Node.js >=20
ESM project support
A Git diff string as input
Access to the local repository files being analyzed

The library reads files from the local filesystem. It does not need a remote backend.

Quick start

import { buildCodeContextReport } from '@mlinaresweb/code-context';

const diff = `
diff --git a/src/user.service.ts b/src/user.service.ts
index 1111111..2222222 100644
--- a/src/user.service.ts
+++ b/src/user.service.ts
@@ -1,5 +1,5 @@
 export class UserService {
   public getName(): string {
-    return 'Ada';
+    return 'Ada Lovelace';
   }
 }
`;

const report = await buildCodeContextReport({
  repositoryRoot: process.cwd(),
  diff,
  stagedFiles: ['src/user.service.ts'],
  config: {
    preset: 'fullstack',
  },
  documentationContext: {
    mode: 'warn',
    minimumTrustStatus: 'safe',
  },
});

console.log(report.markdown);
console.log(report.quality.contextTrustStatus);
console.log(report.documentation.passed);

Core concepts

CodeContextPack

The raw structured output.

It contains changed files, changed symbols, related symbols, references, diagnostics and warnings.

CodeContextReport

A higher-level result produced by buildCodeContextReport.

It includes:

pack
markdown
debugMarkdown
summary
quality
documentation
warnings

Changed symbols

Symbols directly affected by the diff.

Examples:

changed function,
changed method,
changed class,
changed template window,
changed component.

Related symbols

Symbols not directly changed but required to understand the change.

Examples:

imported service,
DTO used by a controller,
repository used by a service,
AJAX handler called by frontend JS,
CSS file defining classes used by a changed template,
WordPress partial included by get_template_part.

Diagnostics

Debug-level or quality-level facts generated during analysis.

Examples:

WordPress index built,
WordPress index cache hit,
related file included,
reference resolved,
reference unresolved.

Documentation readiness

A strict or warning-based quality gate that tells you whether the context is safe enough to generate final documentation.

Getting a Git diff

The library expects a diff string. You can obtain it in different ways.

Staged changes

git diff --staged

Node example:

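import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Root of the repository to analyze.
const repositoryRoot = process.cwd();

// execFile buffers stdout and defaults to a 1 MiB limit, so raise maxBuffer for large diffs.
const { stdout: diff } = await execFileAsync(
  'git',
  ['-C', repositoryRoot, 'diff', '--staged'],
  { maxBuffer: 64 * 1024 * 1024 },
);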

Unstaged changes

git diff

Compare two commits

git diff HEAD~1..HEAD

Compare two branches

git diff main..feature/my-branch

Get changed file names

git diff --name-only --staged

You can pass those file names as stagedFiles:

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles,
});
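
To turn the `git diff --name-only` output into the `stagedFiles` array, a small helper is enough. This is a minimal sketch with a hypothetical function name; it assumes newline-separated output and does not handle the quoted form Git uses for paths with unusual characters (the `-z` flag avoids that quoting by emitting NUL-separated paths).

```typescript
// Turns `git diff --name-only` output into a clean list of repository-relative paths.
function parseNameOnlyOutput(stdout: string): string[] {
  return stdout
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Example input, as Git would print it for two staged files.
const exampleStagedFiles = parseNameOnlyOutput('src/user.service.ts\nsrc/user.controller.ts\n');
console.log(exampleStagedFiles);
```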

GitHub Pull Request diff

With GitHub CLI:

gh pr diff 123 > pr.diff

Then read the file and pass it to the library:

import { readFile } from 'node:fs/promises';

const diff = await readFile('pr.diff', 'utf8');

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
});

GitHub Actions example

name: Code Context

on:
  pull_request:

jobs:
  context:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - run: npm ci

      - name: Generate diff
        run: git diff origin/${{ github.base_ref }}...HEAD > pr.diff

      - name: Generate code context report
        run: node scripts/generate-code-context.js

Main API

buildCodeContextReport

Recommended for most users.

import { buildCodeContextReport } from '@mlinaresweb/code-context';

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles,
  config: {
    preset: ['fullstack', 'documentation'],
  },
  debugOptions: {
    includePromptMarkdown: true,
    includeDiagnostics: true,
  },
  documentationContext: {
    mode: 'warn',
    minimumTrustStatus: 'safe',
  },
});

Returns:

{
  pack,
  markdown,
  debugMarkdown,
  summary,
  quality,
  documentation,
  warnings
}

buildCodeContextPack

Lower-level API.

Use this if you want the raw context pack and plan to render or process it yourself.

import { buildCodeContextPack } from '@mlinaresweb/code-context';

const pack = await buildCodeContextPack({
  repositoryRoot,
  diff,
  stagedFiles,
  config: {
    preset: 'balanced',
  },
});

renderCodeContextForPrompt

Render a CodeContextPack to Markdown.

import {
  buildCodeContextPack,
  renderCodeContextForPrompt,
} from '@mlinaresweb/code-context';

const pack = await buildCodeContextPack({
  repositoryRoot,
  diff,
});

const markdown = renderCodeContextForPrompt({
  pack,
});

writeCodeContextReportFiles

Writes ready-to-inspect files.

import { writeCodeContextReportFiles } from '@mlinaresweb/code-context';

const result = await writeCodeContextReportFiles({
  repositoryRoot,
  diff,
  stagedFiles,
  outputDirectory: '.code-context-report',
  config: {
    preset: 'wordpress-large',
  },
});

Generated files:

code-context-report.md
code-context-debug.md
code-context-pack.json
code-context-summary.json

Presets

Presets are the easiest way to configure the library.

config: {
  preset: 'wordpress-large',
}

You can combine presets:

config: {
  preset: ['fullstack', 'wordpress-large', 'documentation'],
}

Manual config always wins:

config: {
  preset: 'wordpress-large',
  maxCharacters: 900_000,
}

Available presets

| Preset | Purpose |
|---|---|
| balanced | General-purpose default behavior |
| documentation | More context for AI documentation |
| fullstack | Broad JS/TS/PHP/Python/frontend backend context |
| wordpress | WordPress projects with moderate size |
| wordpress-large | Large WordPress themes/plugins and custom platforms |
| ci-fast | Faster analysis for CI pipelines |

Documentation trust and strict mode

The report includes a trust status:

report.quality.contextTrustStatus

Possible values:

| Status | Meaning |
|---|---|
| safe | Context is good enough for final documentation |
| partial | Context is useful but should be reviewed |
| unsafe | Important context is missing |

Warn mode

Default recommended mode for interactive documentation tools.

documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'safe',
}

The function does not throw. You can inspect:

report.documentation.passed
report.documentation.blockingIssues

Strict mode

Recommended for CI or publishing gates.

documentationContext: {
  mode: 'strict',
  minimumTrustStatus: 'safe',
}

If the context does not meet the required trust level, the library throws:

CodeContextUnsafeDocumentationContextError

Example:

import {
  CodeContextUnsafeDocumentationContextError,
  buildCodeContextReport,
} from '@mlinaresweb/code-context';

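try {
  const report = await buildCodeContextReport({
    repositoryRoot,
    diff,
    documentationContext: {
      mode: 'strict',
      minimumTrustStatus: 'safe',
    },
  });

  // Reaching this point means the context met the minimum trust level.
  console.log(report.markdown);
} catch (error) {
  if (error instanceof CodeContextUnsafeDocumentationContextError) {
    // The error carries the readiness result and the quality report.
    console.error(error.readiness.message);
    console.error(error.quality.contextTrustStatus);
  }

  // Rethrow so the calling process (for example a CI job) still fails.
  throw error;
}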

Permissive mode

Never blocks, even if context is unsafe.

documentationContext: {
  mode: 'permissive',
  minimumTrustStatus: 'safe',
}
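
The three modes can be pictured as one comparison plus a policy. The sketch below assumes the ordering unsafe < partial < safe, which the status table suggests but does not state outright; `meetsMinimum` and `evaluateGate` are hypothetical names for this illustration, not library exports.

```typescript
type TrustStatus = 'unsafe' | 'partial' | 'safe';
type GateMode = 'permissive' | 'warn' | 'strict';

// Assumed ordering: a higher rank means more trustworthy context.
const TRUST_RANK: Record<TrustStatus, number> = { unsafe: 0, partial: 1, safe: 2 };

function meetsMinimum(actual: TrustStatus, minimum: TrustStatus): boolean {
  return TRUST_RANK[actual] >= TRUST_RANK[minimum];
}

interface GateResult {
  meetsMinimum: boolean; // whether the context reached the required level
  shouldThrow: boolean; // only strict mode turns a miss into an exception
}

function evaluateGate(mode: GateMode, actual: TrustStatus, minimum: TrustStatus): GateResult {
  const ok = meetsMinimum(actual, minimum);
  return { meetsMinimum: ok, shouldThrow: mode === 'strict' && !ok };
}

console.log(evaluateGate('warn', 'partial', 'safe'));
console.log(evaluateGate('strict', 'partial', 'safe'));
```

Under these assumptions, warn and permissive differ only in how the caller is expected to treat a miss: warn surfaces it for inspection, permissive ignores it.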

WordPress support

WordPress support is one of the strongest parts of the library.

It is designed for real-world projects where functionality is spread across:

theme templates,
plugin bootstrap files,
functions.php,
procedural helpers,
classes,
AJAX handlers,
REST controllers,
shortcodes,
template parts,
frontend JS,
inline scripts,
CSS/SCSS,
ACF fields,
custom post types,
taxonomies,
page builders and integrations.

WordPress Project Index

The library builds an automatic WordPress index for the current theme/plugin root.

It indexes:

PHP functions,
PHP classes,
interfaces,
traits,
hooks,
filters,
AJAX actions,
REST routes,
shortcodes,
template references,
enqueue handles,
asset paths,
custom post types,
taxonomies,
ACF fields,
blocks,
JS functions,
window globals,
JS AJAX actions,
CSS classes,
DOM ids,
file basenames,
path tokens.

This is automatic. You do not need to hardcode project prefixes.

AJAX example

Changed template:

<script>
fetch('/wp-admin/admin-ajax.php?action=choose_delivery')
</script>

The library tries to find:

add_action('wp_ajax_choose_delivery', '...');
add_action('wp_ajax_nopriv_choose_delivery', '...');

and includes the handler context.
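
A simplified version of that lookup could work as follows. This is only a regex-based sketch with hypothetical function names and a hypothetical callback name; it handles string callbacks only (not array callables such as `[$this, 'method']`) and says nothing about how the library's index is actually built.

```typescript
// Finds admin-ajax action names referenced in template or script source.
function findAjaxActions(source: string): string[] {
  const pattern = /admin-ajax\.php\?(?:[^'"\s]*&)?action=([A-Za-z0-9_-]+)/g;
  const actions = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    if (match[1]) actions.add(match[1]);
  }
  return Array.from(actions);
}

// Finds string callbacks registered for an action on wp_ajax_* or wp_ajax_nopriv_*.
function findAjaxHandlers(phpSource: string, action: string): string[] {
  const pattern =
    /add_action\(\s*['"]wp_ajax_(?:nopriv_)?([A-Za-z0-9_-]+)['"]\s*,\s*['"]([^'"]+)['"]/g;
  const callbacks = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(phpSource)) !== null) {
    if (match[1] === action && match[2]) callbacks.add(match[2]);
  }
  return Array.from(callbacks);
}

const templateSource = "fetch('/wp-admin/admin-ajax.php?action=choose_delivery')";
// 'my_choose_delivery' is a made-up callback name for this example.
const pluginSource = "add_action('wp_ajax_choose_delivery', 'my_choose_delivery');";

for (const action of findAjaxActions(templateSource)) {
  console.log(action, findAjaxHandlers(pluginSource, action));
}
```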

Template part example

Changed template:

get_template_part('template-parts/car-form/booking-summary');

The library tries to include:

template-parts/car-form/booking-summary.php
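
The mapping from call to file follows WordPress's own lookup order: with a second `$name` argument, `get_template_part` tries `{slug}-{name}.php` before `{slug}.php`. The sketch below derives those candidate paths from literal string arguments; `templatePartCandidates` is a hypothetical helper, and variable-based paths are out of scope for it.

```typescript
// Lists candidate template files for each literal get_template_part() call,
// in the order WordPress would try them.
function templatePartCandidates(phpSource: string): string[] {
  const pattern = /get_template_part\(\s*['"]([^'"]+)['"]\s*(?:,\s*['"]([^'"]*)['"])?/g;
  const candidates: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(phpSource)) !== null) {
    const slug = match[1];
    const name = match[2];
    if (!slug) continue;
    // The specialised "{slug}-{name}.php" file takes priority when a name is given.
    if (name) candidates.push(`${slug}-${name}.php`);
    candidates.push(`${slug}.php`);
  }
  return candidates;
}

console.log(templatePartCandidates("get_template_part('template-parts/car-form/booking-summary');"));
```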

CSS and DOM relationship example

Changed template:

<div id="checkout-summary" class="booking-summary-wrapper">

The library can include related CSS/JS files that reference:

.booking-summary-wrapper {}
#checkout-summary {}

or JS:

document.getElementById('checkout-summary')
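
One naive way to picture this matching: collect ids and classes from the changed markup, then check whether another file mentions them. The sketch below uses plain substring checks, so it can report false positives (for example a longer class name that starts with the same text); all names in it are hypothetical.

```typescript
interface DomTokens {
  ids: string[];
  classes: string[];
}

// Collects id and class attribute values from a markup fragment.
function extractDomTokens(markup: string): DomTokens {
  const ids: string[] = [];
  const classes: string[] = [];
  const attribute = /(?:^|\s)(id|class)\s*=\s*["']([^"']+)["']/g;
  let match: RegExpExecArray | null;
  while ((match = attribute.exec(markup)) !== null) {
    const values = (match[2] ?? '').split(/\s+/).filter((value) => value.length > 0);
    if (match[1] === 'id') ids.push(...values);
    else classes.push(...values);
  }
  return { ids, classes };
}

// Reports whether a CSS or JS source mentions any of the tokens.
function referencesDomTokens(source: string, tokens: DomTokens): boolean {
  const needles = [
    ...tokens.ids.map((id) => `#${id}`),
    ...tokens.ids.map((id) => `getElementById('${id}')`),
    ...tokens.classes.map((cls) => `.${cls}`),
  ];
  return needles.some((needle) => source.includes(needle));
}

const domTokens = extractDomTokens('<div id="checkout-summary" class="booking-summary-wrapper">');
console.log(domTokens, referencesDomTokens('.booking-summary-wrapper {}', domTokens));
```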

WordPress coverage

The library reports unresolved important references.

Examples:

AJAX action used but no handler found,
template referenced but file not found,
PHP custom function called but not found,
JS global used but not found,
asset referenced but not found.

This helps prevent AI hallucination.

Debug reports and diagnostics

The debug report explains what happened internally.

console.log(report.debugMarkdown);

It can include:

summary,
quality metrics,
warnings,
changed files,
changed symbols,
related symbols,
references,
diagnostics,
WordPress Project Index debug,
resolved references,
unresolved references,
LLM-ready Markdown.

Useful diagnostics include:

| Diagnostic code | Meaning |
|---|---|
| wordpress-index-summary | WordPress index was built |
| wordpress-index-cache-hit | Reused cached index |
| wordpress-index-cache-miss | Built new index |
| wordpress-related-file-included | Related file included |
| wordpress-reference-resolved | Reference resolved |
| wordpress-reference-unresolved | Reference unresolved |

Example:

const unresolved = report.pack.diagnostics?.filter((diagnostic) => {
  return diagnostic.code === 'wordpress-reference-unresolved';
});
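
For a quick overview rather than a single filter, diagnostics can be tallied by code. The helper below is a hypothetical sketch that relies only on the `code` field used above; it could be called as `countByCode(report.pack.diagnostics ?? [])`.

```typescript
// Minimal structural type: only the `code` field is assumed here.
interface DiagnosticLike {
  code: string;
}

// Counts how many diagnostics were emitted per diagnostic code.
function countByCode(diagnostics: readonly DiagnosticLike[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const diagnostic of diagnostics) {
    counts[diagnostic.code] = (counts[diagnostic.code] ?? 0) + 1;
  }
  return counts;
}

console.log(
  countByCode([
    { code: 'wordpress-reference-resolved' },
    { code: 'wordpress-reference-unresolved' },
    { code: 'wordpress-reference-resolved' },
  ]),
);
```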

Writing report files

Use writeCodeContextReportFiles to inspect context manually.

await writeCodeContextReportFiles({
  repositoryRoot,
  diff,
  stagedFiles,
  outputDirectory: './.code-context-report',
  config: {
    preset: ['wordpress-large', 'documentation'],
  },
  debugOptions: {
    includePromptMarkdown: true,
    includeDiagnostics: true,
  },
});

Output:

.code-context-report/
├─ code-context-report.md
├─ code-context-debug.md
├─ code-context-pack.json
└─ code-context-summary.json

CLI/report script usage

If you use the included report script during development, you can generate reports from Git.

Staged changes

npm run report -- --repo "/path/to/project" --staged --out "./code-context-output"

Unstaged changes

npm run report -- --repo "/path/to/project" --unstaged --out "./code-context-output"

Commit range

npm run report -- --repo "/path/to/project" --base HEAD~1 --head HEAD --out "./code-context-output"

WordPress large preset

npm run report -- --repo "/path/to/project" --staged --preset wordpress-large --out "./code-context-output"

Multiple presets

npm run report -- --repo "/path/to/project" --staged --preset fullstack,wordpress-large,documentation --out "./code-context-output"

Strict mode

npm run report -- --repo "/path/to/project" --staged --strict --minimum-trust safe --out "./code-context-output"

Configuration reference

The config object is resolved as:

defaults < preset < user config

Example:

config: {
  preset: 'wordpress-large',
  maxCharacters: 900_000,
}
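
The precedence rule can be sketched as a layered merge. Everything numeric below is made up for illustration (the real defaults and preset values are not listed here), `resolveConfig` is a hypothetical name, and the assumption that later presets in a list override earlier ones is not confirmed by the text above.

```typescript
interface ContextConfig {
  maxCharacters: number;
  maxRelatedSymbols: number;
  maxExternalFiles: number;
}

// Hypothetical values, chosen only so the example has something to merge.
const DEFAULTS: ContextConfig = { maxCharacters: 100_000, maxRelatedSymbols: 50, maxExternalFiles: 20 };
const PRESETS: Record<string, Partial<ContextConfig>> = {
  'wordpress-large': { maxCharacters: 500_000, maxExternalFiles: 150 },
  'ci-fast': { maxCharacters: 50_000, maxRelatedSymbols: 20 },
};

// Applies defaults first, then each preset in order, then explicit user options.
function resolveConfig(
  presets: string | string[] | undefined,
  user: Partial<ContextConfig>,
): ContextConfig {
  const presetNames = presets === undefined ? [] : Array.isArray(presets) ? presets : [presets];
  let resolved: ContextConfig = { ...DEFAULTS };
  for (const presetName of presetNames) {
    resolved = { ...resolved, ...(PRESETS[presetName] ?? {}) };
  }
  return { ...resolved, ...user };
}

console.log(resolveConfig('wordpress-large', { maxCharacters: 900_000 }));
```

In this sketch the user's `maxCharacters` replaces the preset's value, while `maxExternalFiles` still comes from the preset and `maxRelatedSymbols` from the defaults.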

Common options

| Option | Description |
|---|---|
| enabled | Enable or disable code context |
| preset | Preset or list of presets |
| maxCharacters | Global context character budget |
| maxRelatedSymbols | Maximum related symbols |
| maxExternalFiles | Maximum external files to inspect/include |
| radiusLines | Fallback line radius around changed lines |
| dependencyResolutionEnabled | Follow imports/modules |
| frameworkDetectionEnabled | Detect framework references |
| frameworkSymbolLinkingEnabled | Link framework refs to symbols |
| frameworkDeepLinkingEnabled | Include deeper framework-related context |
| usageAnalysisEnabled | Extract usage references |
| crossFileUsageResolutionEnabled | Resolve usage across files |
| dependencyInjectionDetectionEnabled | Detect DI references |
| crossFileDependencyInjectionResolutionEnabled | Resolve DI across files |
| propertyInferenceEnabled | Extract property references |
| contextPrioritizationEnabled | Prioritize context by relevance |
| qualityValidationEnabled | Add quality validation |

WordPress options

| Option | Description |
|---|---|
| wordpressIndexCacheEnabled | Reuse WordPress index within analysis |
| wordpressIndexMaxFiles | Max files scanned by WordPress index |

Output structure

BuildCodeContextReportResult

interface BuildCodeContextReportResult {
  readonly pack: CodeContextPack;
  readonly markdown: string;
  readonly debugMarkdown: string;
  readonly summary: CodeContextReportSummary;
  readonly quality: CodeContextQualityReport;
  readonly documentation: CodeContextDocumentationReadiness;
  readonly warnings: readonly string[];
}

CodeContextReportSummary

Contains:

total characters,
changed file count,
changed symbol count,
related symbol count,
reference counts,
warning count,
diagnostic count,
languages,
file paths,
symbols by relevance.

CodeContextQualityReport

Contains:

status,
contextTrustStatus,
score,
coverage metrics,
penalties,
unresolved reference counts,
isSafeForDocumentation,
issues,
recommendations.

Examples

Node / TypeScript service

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['src/users/user.service.ts'],
  config: {
    preset: 'fullstack',
  },
});

The report may include:

changed service method,
imported repository,
DTO,
interface,
dependency injection tokens,
usage references.

NestJS

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['apps/api/src/users/users.controller.ts'],
  config: {
    preset: ['fullstack', 'documentation'],
  },
});

WordPress template

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['wp-content/themes/my-theme/single-car-checkout.php'],
  config: {
    preset: 'wordpress-large',
  },
});

Python

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['app/api/users.py'],
  config: {
    preset: 'fullstack',
  },
});

React / Next

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['app/users/UserProfile.tsx'],
  config: {
    preset: 'fullstack',
  },
});

Vue / Nuxt

const report = await buildCodeContextReport({
  repositoryRoot,
  diff,
  stagedFiles: ['components/UserCard.vue'],
  config: {
    preset: 'fullstack',
  },
});

Recommended workflows

AI documentation generation

config: {
  preset: ['fullstack', 'documentation'],
},
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'safe',
}

For large WordPress projects:

config: {
  preset: ['wordpress-large', 'documentation'],
},
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'safe',
}

CI gate

documentationContext: {
  mode: 'strict',
  minimumTrustStatus: 'safe',
}

Fast pull request summary

config: {
  preset: 'ci-fast',
},
documentationContext: {
  mode: 'warn',
  minimumTrustStatus: 'partial',
}

Performance and large repositories

For large repositories:

use wordpress-large only when needed,
keep wordpressIndexCacheEnabled: true,
increase wordpressIndexMaxFiles if the project is very large,
inspect code-context-debug.md,
review cache diagnostics,
increase maxCharacters only when the AI model can handle it.

Example:

config: {
  preset: 'wordpress-large',
  wordpressIndexMaxFiles: 10_000,
  maxCharacters: 900_000,
  maxExternalFiles: 220,
  maxRelatedSymbols: 500,
}

The WordPress index cache is per analysis execution. It does not persist between commands, which avoids stale results.
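
To see why raising `maxCharacters` changes what ends up in the report, it helps to picture a character budget being spent in relevance order. The sketch below is one possible greedy strategy, not the library's documented algorithm; `CandidateSymbol`, its `relevance` score and `selectWithinBudget` are hypothetical, and a real implementation might truncate large symbols instead of skipping them.

```typescript
interface CandidateSymbol {
  id: string;
  relevance: number; // higher means more important to the change
  text: string;
}

// Greedily keeps the most relevant symbols that still fit in the character budget.
function selectWithinBudget(
  symbols: readonly CandidateSymbol[],
  maxCharacters: number,
): CandidateSymbol[] {
  const ordered = [...symbols].sort((a, b) => b.relevance - a.relevance);
  const selected: CandidateSymbol[] = [];
  let used = 0;
  for (const candidate of ordered) {
    // Skip anything that would exceed the budget, but keep trying smaller items.
    if (used + candidate.text.length > maxCharacters) continue;
    selected.push(candidate);
    used += candidate.text.length;
  }
  return selected;
}

const picked = selectWithinBudget(
  [
    { id: 'changed-method', relevance: 0.9, text: 'x'.repeat(60) },
    { id: 'large-helper', relevance: 0.5, text: 'y'.repeat(50) },
    { id: 'small-dto', relevance: 0.2, text: 'z'.repeat(30) },
  ],
  100,
);
console.log(picked.map((candidate) => candidate.id));
```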

Troubleshooting

The report is `unsafe`

Check:

report.quality.criticalUnresolvedReferenceCount
report.quality.reportableUnresolvedReferenceCount
report.documentation.blockingIssues

Open code-context-debug.md and look at:

unresolved WordPress references,
warnings,
included files,
coverage report.

A WordPress AJAX handler was not found

Make sure the project contains:

add_action('wp_ajax_my_action', 'my_callback');

or:

add_action('wp_ajax_nopriv_my_action', 'my_callback');

and that the file is inside the same theme/plugin root or within the configured scan limit.

A template part was not found

For:

get_template_part('template-parts/example');

the library expects:

template-parts/example.php

A context report is too large

Reduce:

maxCharacters
maxRelatedSymbols
maxExternalFiles

or use:

preset: 'ci-fast'

A context report is missing important files

Increase:

maxExternalFiles
maxRelatedSymbols
wordpressIndexMaxFiles
maxCharacters

and inspect diagnostics in debugMarkdown.

Limitations

No context extraction library can guarantee perfect results for every repository.

Known limitations:

highly dynamic imports may not always resolve,
runtime-generated WordPress hooks may not be fully detected,
complex PHP variable-based template paths may need fallback context,
minified files are poor inputs, and handling them well is not a design goal,
very large files may be truncated by configured character budgets,
framework-specific support is still incomplete and expands over time.

The library is designed to prefer truthful, inspectable context over hallucinated certainty.

If a reference cannot be resolved, the library should emit a warning or diagnostic rather than guess.

Roadmap

Planned improvements:

persistent optional cache,
more language-specific extractors,
deeper Python import resolution,
deeper Next.js route/app directory awareness,
deeper Nuxt auto-import awareness,
improved PHP namespace/class resolution,
better Composer/autoload understanding,
better WordPress block/theme.json context,
CLI package command,
GitHub Action wrapper,
HTML report output,
VS Code extension integration.

Contributing

Contributions are welcome.

Good contribution areas:

new language extractors,
better framework support,
more WordPress fixtures,
performance improvements,
diagnostics improvements,
README/examples,
bug reproductions with small fixtures.

Recommended workflow:

npm install
npm run check
npm run test
npm run build

Before opening a pull request:

npm run check
npm run test
npm run build
npm run pack:dry

Please include tests for new extraction behavior.

License and attribution

This project is licensed under the MIT License.

That means you can:

use it commercially,
modify it,
fork it,
include it in your own tools,
publish improvements,
distribute copies.

The MIT License requires keeping the copyright and license notice in copies or substantial portions of the software.

Please preserve attribution to:

mlinaresweb

when reusing, forking or publishing derived versions of this library.

See LICENSE and NOTICE.md.
