# repo-memory-graph (v0.1.0)
ESM package for static analysis of JavaScript and TypeScript: start from one file, follow static imports / require / string import(), parse only reachable files, and get dependencies, functions, calls, and an optional queue / worker call map.
Requires Node.js 18+ (native ESM).
## Install
```bash
npm install repo-memory-graph
```

Types are included (`dist/index.d.ts`).
## Quick start
```ts
import { analyze, analyzeStream } from "repo-memory-graph";

const entry = "/absolute/path/to/your/app/src/main.ts";
const repoDir = "/absolute/path/to/your/app";

const result = analyze(entry, repoDir);
console.log(Object.keys(result.files).length, "files");
```

`entry` may be absolute or relative to `process.cwd()`. `repoDir` is the project root used for resolution and for staying inside the repo (unless `includeNodeModules` is set).
## Exported API (functions, constant, types)
There are 10 functions, 1 constant object, and 15 TypeScript types exported from repo-memory-graph.
### Analysis entry points

#### analyze(filePath, repoDir, options?)
Purpose: Run the full walk once and return everything in a single object: every reachable FileRecord, the combined graph, and issues.
When to use: Smaller graphs, scripts, or when you need result.graph without folding the stream yourself.
```ts
import { analyze } from "repo-memory-graph";

const { files, graph, issues, entry, repoDir } = analyze("./src/index.ts", "/path/to/repo", {
  includeNodeModules: false,
  queueWorker: {}, // optional
});

const firstPath = Object.keys(files)[0];
console.log(files[firstPath]!.calls.length, "calls in", firstPath);
console.log(issues.length, "issues (e.g. unresolved imports)");
```

#### analyzeStream(filePath, repoDir, options?)
Purpose: Same traversal as analyze, but yields { kind: "file", record } or { kind: "issue", issue } one at a time. The library does not keep earlier files in memory after each yield.
When to use: Large graphs, bounded memory, or incremental sinks (DB, logs, transforms).
```ts
import { analyzeStream } from "repo-memory-graph";

for (const event of analyzeStream("./src/index.ts", "/path/to/repo")) {
  if (event.kind === "file") {
    const { path, dependencies, calls } = event.record;
    if (dependencies.some((d) => !d.resolved)) {
      console.warn("unresolved in", path);
    }
  } else {
    console.warn(event.issue.message, event.issue.path);
  }
}
```

#### analyzeReadable(filePath, repoDir, options?)
Purpose: Expose the same events as analyzeStream through a Node.js Readable in objectMode: true, so you can pipe or consume with stream APIs.
When to use: Integrating with pipeline(), backpressure, or tools that expect a stream.
```ts
import { createWriteStream } from "node:fs";
import { pipeline } from "node:stream/promises";
import { analyzeReadable } from "repo-memory-graph";

const readable = analyzeReadable("./src/index.ts", "/path/to/repo");

// Example: consume manually (or pass `readable` through pipeline()
// with your own transforms and a writable sink).
readable.on("data", (event) => {
  if (event.kind === "file") console.log(event.record.path);
});
```

### Queue / worker graph helpers
#### buildQueueWorkerEdges(matches)
Purpose: Turn a flat list of QueueWorkerMatch (from many files) into deduplicated edges: fromFile is the file that contains the call; optional targetWorker / jobType come from static argument patterns.
```ts
import { analyzeStream, buildQueueWorkerEdges } from "repo-memory-graph";

const matches = [];
for (const ev of analyzeStream(entry, repoDir, { queueWorker: {} })) {
  if (ev.kind === "file" && ev.record.queueWorkerMatches) {
    matches.push(...ev.record.queueWorkerMatches);
  }
}

const edges = buildQueueWorkerEdges(matches);
console.table(edges.map((e) => [e.fromFile, e.targetWorker, e.jobType, e.line]));
```

#### DEFAULT_QUEUE_WORKER_CONFIG (constant)
Purpose: Baseline rules for queue-style calls (callee substrings, empty regex/keyword lists by default, indices for worker vs payload args, jobTypeProperty: "type"). Merge or override with mergeQueueWorkerConfig.
```ts
import { DEFAULT_QUEUE_WORKER_CONFIG } from "repo-memory-graph";

console.log(DEFAULT_QUEUE_WORKER_CONFIG.calleeSubstrings);
// includes "enqueueJob", "enqueue", "addJob", etc.
```

#### mergeQueueWorkerConfig(base, override)
Purpose: Shallow merge two configs; any array or scalar you pass on override replaces the corresponding field on base (useful to layer JSON + defaults).
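For intuition, the field-replacement behavior is essentially an object spread. A minimal stand-alone sketch of the semantics (local `CfgLite` stand-in type, not the library's implementation):

```ts
// Local stand-in config type to illustrate shallow-merge semantics.
type CfgLite = {
  calleeSubstrings: string[];
  argumentKeywords: string[];
  jobTypeProperty: string;
};

const base: CfgLite = {
  calleeSubstrings: ["enqueue", "addJob"],
  argumentKeywords: ["queue"],
  jobTypeProperty: "type",
};

// Shallow merge: any field present on the override replaces the base field wholesale.
const merged: CfgLite = { ...base, calleeSubstrings: ["enqueueJob"] };

// merged.calleeSubstrings is ["enqueueJob"]: replaced, not concatenated.
// merged.argumentKeywords is ["queue"]: inherited from base untouched.
```

Note that arrays are replaced, not concatenated: passing `calleeSubstrings` on the override discards the base list entirely.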
```ts
import { DEFAULT_QUEUE_WORKER_CONFIG, mergeQueueWorkerConfig } from "repo-memory-graph";

const strict = mergeQueueWorkerConfig(DEFAULT_QUEUE_WORKER_CONFIG, {
  calleeSubstrings: ["enqueueJob"],
  argumentKeywords: [],
});
```

#### loadQueueWorkerConfigFromFile(filePath)
Purpose: Read a JSON file, validate basic shapes, and merge with DEFAULT_QUEUE_WORKER_CONFIG. Throws if JSON is invalid or not an object.
```ts
import { loadQueueWorkerConfigFromFile } from "repo-memory-graph";

const cfg = loadQueueWorkerConfigFromFile("/path/to/queue-worker.json");
// pass as analyze(..., { queueWorker: cfg })
```

#### resolveQueueWorkerOption(queueWorker, cwd?)
Purpose: What analyze / analyzeStream use internally: if the option is a string, load JSON from disk (relative to cwd, default process.cwd()); if an object, merge with defaults; if undefined, return undefined (no queue scanning).
```ts
import { resolveQueueWorkerOption } from "repo-memory-graph";

const cfg = resolveQueueWorkerOption("./config/queue-worker.json");
const cfg2 = resolveQueueWorkerOption({ calleeSubstrings: ["addJob"] });
```

### Low-level AST helpers (Babel)
Use these when you already have a Babel AST (e.g. a custom Babel plugin or script using @babel/parser). AST node types come from @babel/types (a dependency of repo-memory-graph; install it alongside if your tooling needs explicit types).
#### calleeToString(callee)
Purpose: Serialize a call’s callee expression to a short string, e.g. `queueManager.enqueueJob`, `import`, `?.enqueue`.
```ts
import * as t from "@babel/types";
import { calleeToString } from "repo-memory-graph";

const callee = t.memberExpression(t.identifier("queueManager"), t.identifier("enqueueJob"));
calleeToString(callee); // "queueManager.enqueueJob"
```

#### tailMemberPropertyName(expression)
Purpose: For `workerConstants.email`, return the last property name (`"email"`). Used when inferring a static “worker key” from the first enqueue argument.
```ts
import * as t from "@babel/types";
import { tailMemberPropertyName } from "repo-memory-graph";

const expr = t.memberExpression(t.identifier("workerConstants"), t.identifier("email"));
tailMemberPropertyName(expr); // "email"
```

#### matchQueueWorkerCall(filePath, path, config, enclosing?)
Purpose: Given a NodePath<CallExpression | OptionalCallExpression> from @babel/traverse, return a QueueWorkerMatch or null using the same rules as the main analyzer. Lets you reuse detection outside extractFile.
```ts
import { parse } from "@babel/parser";
import traverse from "@babel/traverse";
import { matchQueueWorkerCall, DEFAULT_QUEUE_WORKER_CONFIG } from "repo-memory-graph";

const ast = parse("queueManager.enqueueJob(w.email, { type: 'x' });", {
  sourceType: "module",
});

traverse(ast, {
  CallExpression(path) {
    const m = matchQueueWorkerCall("/app/foo.ts", path, DEFAULT_QUEUE_WORKER_CONFIG);
    console.log(m?.jobType, m?.targetWorker);
    path.stop();
  },
});
```

## FileRecord
| Field | Content |
|-------|---------|
| `path` | Absolute path |
| `dependencies` | `{ specifier, target, resolved }[]` |
| `functions` | Declarations / methods / arrows (name, line, kind) |
| `calls` | Callee string, line, optional enclosing function name |
| `queueWorkerMatches` | Present when `options.queueWorker` is set (may be empty) |
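As an illustration of consuming these fields, here is a small helper that counts unresolved dependencies across records. `FileRecordLite` and `DependencyLite` are simplified local stand-ins for the package's `FileRecord` and `DependencyEdge` types, keeping only the fields used here:

```ts
// Simplified stand-ins for the package's record types (only the fields used below).
type DependencyLite = { specifier: string; target?: string; resolved: boolean };
type FileRecordLite = { path: string; dependencies: DependencyLite[] };

// Count dependencies that failed to resolve across a set of records.
function unresolvedCount(records: FileRecordLite[]): number {
  return records.reduce(
    (n, r) => n + r.dependencies.filter((d) => !d.resolved).length,
    0,
  );
}

const records: FileRecordLite[] = [
  { path: "/app/a.ts", dependencies: [{ specifier: "./b", resolved: true }] },
  { path: "/app/c.ts", dependencies: [{ specifier: "missing-pkg", resolved: false }] },
];
console.log(unresolvedCount(records)); // 1
```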
## Options (AnalyzeOptions)
| Option | Type | Default | Purpose |
|--------|------|---------|---------|
| `extensions` | `string[]` | `[".tsx", ".ts", …]` | Resolution order for extensionless paths |
| `modules` | `string[]` | `[repoDir/node_modules, "node_modules"]` | Resolver modules paths |
| `includeNodeModules` | `boolean` | `false` | Follow and parse under `node_modules` when resolved |
| `queueWorker` | `QueueWorkerConfig \| string` | — | Enable queue/worker detection; string = JSON file path |
queueWorker JSON example (merged with defaults):
```json
{
  "calleeSubstrings": ["enqueueJob"],
  "calleeRegexes": [],
  "argumentKeywords": [],
  "jobTypeProperty": "type",
  "jobPayloadArgIndex": 1,
  "workerTargetArgIndex": 0
}
```

## Exported TypeScript types
These are compile-time only (`export type` from the package entry):
AnalysisGraph, AnalysisIssue, AnalysisResult, AnalysisStreamEvent, AnalyzeOptions, CallInfo, DependencyEdge, FileRecord, FunctionInfo, GraphEdge, GraphNode, QueueWorkerConfig, QueueWorkerEdge, QueueWorkerMatch, QueueWorkerMatchReason.
## Choosing analyze vs analyzeStream
| Situation | Prefer |
|-----------|--------|
| Need result.files + result.graph in one shot | analyze |
| Large graph or stream-shaped pipeline | analyzeStream / analyzeReadable |
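If you start with the streaming API but later need the batch shape, folding the events back into an `analyze`-like object is a short loop. A minimal sketch with local stand-in types (the real `AnalysisStreamEvent` and `FileRecord` come from the package; `fold` is a hypothetical helper, not part of the API):

```ts
// Local stand-ins for the package's stream event types (simplified).
type FileRecordLite = { path: string };
type IssueLite = { message: string; path?: string };
type StreamEvent =
  | { kind: "file"; record: FileRecordLite }
  | { kind: "issue"; issue: IssueLite };

// Fold a stream of events into the { files, issues } shape that analyze() returns.
function fold(events: Iterable<StreamEvent>) {
  const files: Record<string, FileRecordLite> = {};
  const issues: IssueLite[] = [];
  for (const ev of events) {
    if (ev.kind === "file") files[ev.record.path] = ev.record;
    else issues.push(ev.issue);
  }
  return { files, issues };
}

const { files, issues } = fold([
  { kind: "file", record: { path: "/app/src/index.ts" } },
  { kind: "issue", issue: { message: "unresolved import", path: "/app/src/index.ts" } },
]);
console.log(Object.keys(files).length, issues.length); // 1 1
```

Of course, folding everything back into memory forfeits the bounded-memory benefit, so do this only when the batch shape is genuinely required.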
## Behavior and limits
- Only modules reachable from the entry via static `import` / `require` / string `import()` are analyzed.
- Each file is read fully for parsing; streaming avoids the library retaining all `FileRecord`s, not token-by-token disk streaming.
- Unresolved specifiers appear in `issues` and in `dependencies` with `resolved: false`.
- TypeScript `paths`/`baseUrl` are not applied; resolution uses `enhanced-resolve` in Node/bundler style.
## Publishing (maintainers)
Use a granular npm token or an account with publish rights. Put the token only in `.env` (already gitignored), never in git:

```ini
# .env — never commit (use a granular npm access token)
NPM_TOKEN=your_token_here
```

Then:
```bash
npm run publish:npm
```

`scripts/publish.mjs` merges `//registry.npmjs.org/:_authToken=…` into a local `.npmrc` (also gitignored), runs `npm publish --access public`, then restores or removes `.npmrc` so the token is not left in the repo.
`NODE_AUTH_TOKEN` in `.env` is accepted as an alias for `NPM_TOKEN`.
## License
MIT
