
@mcarvin/smart-diff

v2.1.0

Summarizes a git diff using any LLM provider supported by the Vercel AI SDK (OpenAI, Anthropic, Google, Bedrock, Mistral, Cohere, Groq, xAI, DeepSeek, or any OpenAI-compatible gateway).

smart-diff

TypeScript library that turns a git revision range into a Markdown summary using any LLM provider supported by the Vercel AI SDK — OpenAI, Anthropic, Google Gemini, Amazon Bedrock, Mistral, Cohere, Groq, xAI, DeepSeek, or any OpenAI-compatible gateway. It uses simple-git to read the repo, respects path includes/excludes and commit message include/exclude regexes, and sends commits, paths, structured diff stats, and unified diff text to the model.

Requirements

Installation

npm install @mcarvin/smart-diff

@ai-sdk/openai and @ai-sdk/openai-compatible ship as direct dependencies. Every other provider (@ai-sdk/anthropic, @ai-sdk/google, @ai-sdk/amazon-bedrock, @ai-sdk/mistral, @ai-sdk/cohere, @ai-sdk/groq, @ai-sdk/xai, @ai-sdk/deepseek) is declared as an optional peer and only needs to be installed when you actually use that provider. If the package is missing, smart-diff throws a clear error telling you which one to install.
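
For example, to use Anthropic you would install its provider package alongside smart-diff (the other optional peers listed above work the same way):

npm install @mcarvin/smart-diff @ai-sdk/anthropic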

Provider configuration

smart-diff is "configured" when isLlmProviderConfigured() returns true — i.e. at least one supported provider can be resolved from env vars — or you pass your own llmModelProvider factory. Otherwise summarizeGitDiff / generateSummary throw with LLM_GATEWAY_REQUIRED_MESSAGE.
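
A minimal sketch of that guard, using only names the package exports (see the lower-level API list below):

import {
  isLlmProviderConfigured,
  summarizeGitDiff,
  LLM_GATEWAY_REQUIRED_MESSAGE,
} from '@mcarvin/smart-diff';

// Fail fast with the library's own message when no provider can be resolved
// from the environment and no llmModelProvider factory is supplied.
if (!isLlmProviderConfigured()) {
  throw new Error(LLM_GATEWAY_REQUIRED_MESSAGE);
}

const markdown = await summarizeGitDiff({ from: 'origin/main', to: 'HEAD' });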

Selecting a provider

LLM_PROVIDER explicitly selects a provider. When unset, the resolver auto-detects in this order: LLM_BASE_URL / OPENAI_BASE_URL → openai-compatible, OPENAI_API_KEY / LLM_API_KEY → openai, then ANTHROPIC_API_KEY, GOOGLE_GENERATIVE_AI_API_KEY (or GOOGLE_API_KEY), MISTRAL_API_KEY, COHERE_API_KEY, GROQ_API_KEY, XAI_API_KEY, DEEPSEEK_API_KEY, and finally OPENAI_DEFAULT_HEADERS / LLM_DEFAULT_HEADERS → openai.

| Provider (LLM_PROVIDER) | Package | Credential env vars | Default model |
|---|---|---|---|
| openai | @ai-sdk/openai | OPENAI_API_KEY or LLM_API_KEY | gpt-4o-mini |
| openai-compatible | @ai-sdk/openai-compatible | LLM_BASE_URL or OPENAI_BASE_URL (required); OPENAI_API_KEY/LLM_API_KEY or custom headers | gpt-4o-mini |
| anthropic | @ai-sdk/anthropic | ANTHROPIC_API_KEY | claude-3-5-haiku-latest |
| google | @ai-sdk/google | GOOGLE_GENERATIVE_AI_API_KEY or GOOGLE_API_KEY | gemini-2.0-flash |
| bedrock | @ai-sdk/amazon-bedrock | Standard AWS credential chain (env / profile / role) | anthropic.claude-3-5-haiku-20241022-v1:0 |
| mistral | @ai-sdk/mistral | MISTRAL_API_KEY | mistral-small-latest |
| cohere | @ai-sdk/cohere | COHERE_API_KEY | command-r-08-2024 |
| groq | @ai-sdk/groq | GROQ_API_KEY | llama-3.1-8b-instant |
| xai | @ai-sdk/xai | XAI_API_KEY | grok-2-latest |
| deepseek | @ai-sdk/deepseek | DEEPSEEK_API_KEY | deepseek-chat |

LLM_* wins over OPENAI_* where both exist.
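
For example, pinning a provider explicitly instead of relying on auto-detection (Groq shown here; any id from the table works, and the model override is optional):

$env:LLM_PROVIDER = "groq"
$env:GROQ_API_KEY = "..."
# Optional: $env:LLM_MODEL = "..."   # otherwise the llama-3.1-8b-instant default is used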

Common env vars

| Variable | Purpose |
|---|---|
| LLM_PROVIDER | Explicit provider id from the table above. |
| LLM_MODEL | Overrides the per-provider default model id. |
| OPENAI_BASE_URL / LLM_BASE_URL | Base URL for an OpenAI-compatible gateway; presence alone auto-selects the openai-compatible provider. |
| OPENAI_DEFAULT_HEADERS / LLM_DEFAULT_HEADERS | JSON object of extra headers merged onto OpenAI / OpenAI-compatible requests (e.g. RBAC tokens, raw Authorization). LLM_* overrides OPENAI_* key-by-key. |
| LLM_PROVIDER_NAME | Display name used when openai-compatible is active (defaults to openai-compatible). |
| OPENAI_MAX_DIFF_CHARS / LLM_MAX_DIFF_CHARS | Max size of unified diff text sent to the model (default ~120k characters). |
| OPENAI_MAX_TOKENS / LLM_MAX_TOKENS | Max completion tokens (default 4000). |
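
For example, tightening the request-size limits via the LLM_* variants (the values here are illustrative; the defaults are ~120k characters and 4000 tokens):

$env:LLM_MAX_DIFF_CHARS = "60000"
$env:LLM_MAX_TOKENS = "2000"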

Example: native OpenAI

$env:OPENAI_API_KEY = "sk-..."
# Optional: $env:LLM_MODEL = "gpt-4o"

Example: Anthropic Claude

$env:ANTHROPIC_API_KEY = "sk-ant-..."
$env:LLM_MODEL = "claude-3-5-sonnet-latest"   # optional override

Example: company-managed OpenAI-compatible gateway

$env:OPENAI_BASE_URL = "https://llm-gateway.example.com"
$env:OPENAI_DEFAULT_HEADERS = '{"x-company-rbac":"your-rbac-token-here","Authorization":"Bearer sk-your-api-key-here"}'
# LLM_PROVIDER is auto-detected as "openai-compatible" because LLM_BASE_URL/OPENAI_BASE_URL is set.

Example: Google Gemini

$env:GOOGLE_GENERATIVE_AI_API_KEY = "..."
$env:LLM_MODEL = "gemini-2.0-flash"

Usage

summarizeGitDiff

import { summarizeGitDiff } from '@mcarvin/smart-diff';

const markdown = await summarizeGitDiff({
  from: 'origin/main',
  to: 'HEAD',
  cwd: '/path/to/repo', // optional; default process.cwd()
  includeFolders: ['src'],
  excludeFolders: ['node_modules', 'dist'],
  commitMessageExcludeRegexes: ['^\\[bot\\]'],
  commitMessageIncludeRegexes: ['^feat:'], // optional; OR across patterns
  teamName: 'Platform',
  systemPrompt: undefined,   // optional; overrides DEFAULT_GIT_DIFF_SYSTEM_PROMPT
  provider: 'anthropic',     // optional; overrides LLM_PROVIDER env + auto-detection
  model: 'claude-3-5-sonnet-latest', // optional
  maxDiffChars: 120_000,     // optional; also see LLM_MAX_DIFF_CHARS
});

| Option | Description |
|--------|-------------|
| from / to | Git refs for the range; to defaults to HEAD. |
| cwd / git | Working tree for simple-git, or inject your own SimpleGit instance. |
| includeFolders | Limit diff to these paths relative to repo root (omit for full repo minus excludes). |
| excludeFolders | Excluded paths (git :(exclude) pathspecs), e.g. node_modules. |
| commitMessageIncludeRegexes | If any pattern is non-empty, only commits whose full message matches at least one pattern are kept (after excludes). Case-insensitive. |
| commitMessageExcludeRegexes | Drop commits whose message matches any of these patterns. |
| teamName | Adds a Team: line to the user payload for the model. |
| systemPrompt | Replaces the default system prompt. |
| provider | LlmProviderId — wins over LLM_PROVIDER env and auto-detection. |
| model | Chat model id; overrides LLM_MODEL and the provider default. |
| maxDiffChars | Caps unified diff size for the request. |
| contextLines | Number of context lines around each change (git diff -U<n>). Lower values (1 or 0) are the single biggest token saver on modification-heavy diffs. |
| ignoreWhitespace | Passes -w / --ignore-all-space to git diff so pure-whitespace hunks don't consume tokens. Also applies to --numstat / --name-status so counts stay consistent. |
| stripDiffPreamble | Removes low-value lines from the unified diff (diff --git, index, mode changes, similarity/rename/copy metadata). --- a/…, +++ b/…, and @@ hunk headers are kept. |
| maxHunkLines | Caps the body of each hunk; anything past the limit is replaced with a single elision marker. The @@ header and DiffSummary totals are preserved. |
| excludeDefaultNoise | Merges the built-in DEFAULT_NOISE_EXCLUDES list (lockfiles, dist, build, out, coverage, node_modules, __snapshots__) into excludeFolders. |
| llmModelProvider | () => Promise<LanguageModel> — bypass env-based resolution entirely; hand-wire a Vercel AI SDK LanguageModel (required in tests or custom setups). |
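
A couple of these options are easiest to see in code. A sketch that injects an existing simple-git instance and turns on the built-in noise excludes (the refs and repo path are placeholders):

import { simpleGit } from 'simple-git';
import { summarizeGitDiff } from '@mcarvin/smart-diff';

// Reuse a SimpleGit instance you already have instead of passing cwd.
const git = simpleGit('/path/to/repo');

const markdown = await summarizeGitDiff({
  from: 'v1.0.0',
  to: 'HEAD',
  git,                        // injected SimpleGit instance
  excludeDefaultNoise: true,  // merges DEFAULT_NOISE_EXCLUDES into excludeFolders
});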

Reducing tokens

For most repos, the cheapest wins are:

await summarizeGitDiff({
  from: 'origin/main',
  contextLines: 1,          // -U1 cuts 30-60% of tokens on typical diffs
  ignoreWhitespace: true,   // drop pure-whitespace hunks entirely
  stripDiffPreamble: true,  // kill `index`/`mode`/`similarity` lines
  maxHunkLines: 400,        // truncate monster hunks but keep the @@ header
  excludeDefaultNoise: true // skip lockfiles, dist/, coverage/, node_modules/
});

These options only reshape the unified diff text — the structured DiffSummary still reports true file counts and line totals, so the model always sees the full change inventory.

Injecting your own LanguageModel

If you want full control — for example, to configure retries, middlewares, or hit an in-process mock — pass llmModelProvider:

import { summarizeGitDiff } from '@mcarvin/smart-diff';
import { createAnthropic } from '@ai-sdk/anthropic';

const md = await summarizeGitDiff({
  from: 'origin/main',
  llmModelProvider: async () =>
    createAnthropic({ apiKey: process.env.MY_ANTHROPIC_KEY })(
      'claude-3-5-sonnet-latest',
    ),
});

Diff shape: single range vs per-commit

  • Single unified diff for from..to when no commit-message filters apply and the filtered commit list matches the full log for that range.
  • Concatenated per-commit patches (<hash>^!) when you use include/exclude regexes or when the filtered commit list differs in length from the full range (so the diff reflects only the commits that remain).
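
For example, a commit-message filter forces the per-commit shape, so the summary only reflects the commits that survive the filter (the regexes are illustrative):

import { summarizeGitDiff } from '@mcarvin/smart-diff';

// Only feat:/fix: commits are kept, so the diff is assembled from per-commit
// patches (<hash>^!) instead of a single origin/main..HEAD range.
const markdown = await summarizeGitDiff({
  from: 'origin/main',
  to: 'HEAD',
  commitMessageIncludeRegexes: ['^feat:', '^fix:'],
});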

Lower-level API

The package also exports helpers for building a custom pipeline on top of the same git and LLM behavior:

  • Git: createGitClient, getRepoRoot, getCommits, getDiff, getDiffSummary, getChangedFiles, filterCommitsByMessageRegexes, buildDiffPathspecs, buildDiffShapingGitArgs, shapeUnifiedDiff, DEFAULT_NOISE_EXCLUDES
  • AI: generateSummary, resolveLlmMaxDiffChars, truncateUnifiedDiffForLlm
  • Provider resolution: resolveLanguageModel, detectLlmProvider, isLlmProviderConfigured, defaultModelForProvider, resolveLlmBaseUrl, parseLlmDefaultHeadersFromEnv
  • Constants / types: DEFAULT_GIT_DIFF_SYSTEM_PROMPT, LLM_GATEWAY_REQUIRED_MESSAGE, LlmProviderId, LlmModelProvider, ResolveLanguageModelOptions, GenerateSummaryInput, SummarizeFlags

Migrating from 1.x → 2.x

v2 replaces the direct openai SDK dependency with the Vercel AI SDK. If you only rely on env-var configuration, your setup keeps working — OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_DEFAULT_HEADERS, LLM_* equivalents, OPENAI_MAX_DIFF_CHARS, and OPENAI_MAX_TOKENS are all still honored.

Breaking changes:

  • Removed openAiClientProvider option on summarizeGitDiff/generateSummary. Use llmModelProvider: () => Promise<LanguageModel> returning a Vercel AI SDK model instead.
  • Removed OpenAiLikeClient and createOpenAiLikeClient exports, along with shouldUseLlmGateway. Use isLlmProviderConfigured() / resolveLanguageModel() instead.
  • openai npm package is no longer a dependency. Remove it from your own package.json if you only depended on it transitively via smart-diff.


License

MIT — see LICENSE.md.