ngraph.leiden
v0.2.0
Published
Leiden/Louvain community detection for ngraph.graph (JS)
Maintainers
Readme
ngraph.leiden
Leiden/Louvain community detection for ngraph.graph. Fast, deterministic (seeded), and flexible: undirected/directed modularity, CPM with resolution, multilayer aggregation, fixed nodes, and custom weights/sizes.
Works with ESM and CommonJS. Built as a Vite library; tested with Vitest.
Install
Install the published package in your project:
npm i ngraph.leidenThis repository uses Node 18+ for development. To work on the repo locally:
npm install
npm test
npm run buildQuick start
import createGraph from 'ngraph.graph'
import { detectClusters } from 'ngraph.leiden'
const g = createGraph()
// Undirected graphs: just add a single link; adapter will symmetrize for you
g.addNode('a'); g.addNode('b'); g.addNode('c'); g.addNode('d')
g.addLink('a','b')
g.addLink('c','d')
const result = detectClusters(g, { randomSeed: 42 })
// Get a community id for a node
console.log('a ->', result.getClass('a'))
// Group membership as Map<communityId, nodeId[]>
console.log(result.getCommunities())
// Partition quality for the chosen objective
console.log('Q =', result.quality())CommonJS:
const createGraph = require('ngraph.graph')
const { detectClusters } = require('ngraph.leiden')API
Signature
detectClusters(input, options?) => Clusters
Input
input: either- an
ngraph.graphinstance; or - a multilayer array:
[{ graph, weight?, linkWeight?, nodeSize? }]— all layers must share the same node ids (see Multilayer graphs).
- an
Return value: Clusters
getClass(nodeId): number— community id fornodeId.getCommunities(): Map<number, string[]>— nodes grouped by community id.quality(): number— objective value for the final partition. Withquality: 'cpm', see CPM on how it’s computed andoptions.cpmMode.toJSON(): { membership: Record<id, number>, meta: { levels: number, quality: number, options: object } }.
Options
Each option below is optional. Defaults are shown in backticks.
quality
- What: Objective to optimize.
- Values:
'modularity'|'cpm'. Default:'modularity'. - Why: Use
'modularity'for classic community detection; use'cpm'(Constant Potts Model) when you need explicit control over community granularity viaresolutionor when modularity’s resolution limit is an issue. - When: Prefer
'cpm'for very large graphs or when you want consistent small communities; otherwise'modularity'is a strong default.
resolution (CPM)
- What: The
gammaparameter for CPM. - Value:
number. Default:1.0. - Why: Larger
resolutionyields more/smaller communities; smaller yields fewer/larger. - When: Tune to match desired granularity; try
0.1,1.0,5.0as starting points.
directed
- What: Enable Leicht–Newman directed modularity.
- Value:
boolean. Default:false. - Why: For directed networks where in/out strengths matter.
- When: Set
truefor directed graphs. For undirected, keepfalseand add reciprocal links in your graph.
randomSeed
- What: Seed for the internal RNG used for node order shuffles.
- Value:
number. Default:42. - Why: Ensures deterministic, reproducible partitions.
- When: Set explicitly for reproducible experiments or tests.
candidateStrategy
- What: Controls which target communities are considered when moving a node.
- Values:
'neighbors'(default),'all','random','random-neighbor'. - Why: Trade-off between speed and search breadth.
- When:
'neighbors': fastest and typical; consider only adjacent communities.'all': exhaustive but slow on large graphs.'random': sample from all communities (broader search at bounded cost).'random-neighbor': sample from neighbor communities (cheap, slightly more exploratory).
allowNewCommunity
- What: Allow moves that create a fresh singleton community.
- Value:
boolean. Default:false. - Why: Can help escape local minima in specific structures.
- When: Rarely needed; try enabling if you see over-merged communities.
maxCommunitySize
- What: Upper bound on a community’s total size.
- Value:
number. Default:Infinity. - How measured: Uses
nodeSize(default1per node). A move that would exceed this bound is skipped. - Why: Enforce capacity or balance constraints.
- When: Useful for constrained clustering or to avoid giant communities.
refine
- What: Enable Leiden-style refinement between coarsening levels.
- Value:
boolean. Default:true. - Why: Improves partitions by re-optimizing within coarse communities.
- When: Keep
truefor best quality; turn off for speed-sensitive runs.
fixedNodes
- What: Nodes that must remain in their initial communities at the finest level.
- Value:
SetorArrayof node ids. - Why: Respect domain constraints or anchor communities.
- When: Use to pin landmarks, seeds, or known group members.
preserveLabels
- What: Control how community ids are compacted after renumbering.
- Values:
false(default),true(preserve old order), orMap<oldId, order>. - Why: Stable ids across runs or alignment to a predefined ordering.
- When: Advanced; mostly for downstream integration/visualization.
linkWeight
- What: Function to read an edge’s weight.
- Signature:
(link) => number. Default:link.data?.weight ?? 1. - Why: Use custom attributes (e.g., frequency, strength) as weights.
- Example:
const res = detectClusters(g, { linkWeight: l => Math.max(0, l.data?.w ?? 0) })
nodeSize
- What: Function to read a node’s size (used by CPM and
maxCommunitySize). - Signature:
(node) => number. Default:node.data?.size ?? 1. - Why: Make CPM size-aware or weight capacity by node importance.
- Example:
const res = detectClusters(g, { nodeSize: n => n.data?.pop ?? 1, quality: 'cpm' })
maxLevels, maxLocalPasses
- What: Internal performance knobs for the multi-level loop and local passes.
- Values:
number. Defaults are conservative and usually fine. - Why/When: Only tune if you profile and identify a need.
Vocabulary
- Modularity: Measures how much more densely connected nodes are within communities than expected at random. Directed modularity (Leicht–Newman) uses in/out strengths. Higher is better; can suffer from a resolution limit on very large graphs.
- CPM (Constant Potts Model): Maximizes internal edge weight minus gamma × a penalty for community size. Gamma (resolution) controls granularity: larger gamma → more/smaller communities; smaller gamma → fewer/larger. Supports custom node sizes via nodeSize.
CPM
CPM is a resolution-tunable objective. During optimization this package uses node sizes (via nodeSize), so gains reflect community size; with default nodeSize=1 this matches “unit-count” CPM. If you supply custom node sizes, the optimization becomes size-aware. The quality() reporter supports:
- options.cpmMode: 'unit' | 'size-aware' (affects only how quality() is computed, not the move heuristic). If you use custom node sizes and want "unit-count" reporting, keep cpmMode='unit'.
Multilayer graphs
Pass an array of layers: [{ graph, weight?, linkWeight?, nodeSize? }]. Edges are aggregated by summing weight * linkWeight(link) per layer. All layers must have the same set of node ids. Node sizes default to the first layer’s nodeSize.
Example:
const result = detectClusters([
{ graph: layer1, weight: 1.0 },
{ graph: layer2, weight: 0.2 },
], { quality: 'modularity', randomSeed: 7 })When to use it
- Multiplex networks: combine different relation types (e.g., friendship + collaboration).
- Temporal smoothing: blend snapshots with weights to reduce short-term noise.
- Heterogeneous signals: fuse a strong–sparse layer with a weak–dense layer.
- Denoising: add a lightly weighted prior layer to stabilize communities.
Directed graphs
Set { directed: true } to use Leicht–Newman directed modularity. Edges are taken as-is; no reciprocal links are needed.
Undirected handling
- If directed: false (default), the adapter symmetrizes edges so you don’t need to add reciprocal links to your ngraph.graph.
- If both directions exist between two nodes with possibly different weights, the adapter averages them: w_undirected = (w_ij + w_ji) / 2, then emits symmetric edges with that weight. This preserves total weight and keeps results consistent whether you supplied one or both directions.
Constraints and ergonomics
- Fixed nodes: keep given nodes in their initial communities at the finest level (also respected during refinement).
- Community size limit: maxCommunitySize prevents moves that would exceed the given total nodeSize.
- Determinism: randomSeed controls shuffle order; repeated runs with the same seed and inputs are deterministic.
- Negative weights and self-loops are supported; use negative weights with care as modularity assumptions may not hold theoretically.
Examples
Modularity with custom weights
const g = createGraph()
g.addNode('x'); g.addNode('y'); g.addNode('z')
g.addLink('x','y', { weight: 2 }); g.addLink('y','x', { weight: 2 })
g.addLink('y','z', { weight: 0.3 }); g.addLink('z','y', { weight: 0.3 })
const res = detectClusters(g, { quality: 'modularity', randomSeed: 1, linkWeight: l => l.data?.weight ?? 1 })CPM with resolution and node sizes
const res = detectClusters(g, {
quality: 'cpm',
resolution: 0.5,
nodeSize: n => n.data?.size ?? 1,
randomSeed: 3,
})Fix a subset of nodes
const res = detectClusters(g, { fixedNodes: new Set(['x','z']), randomSeed: 11, refine: true })Build and test
- Build library bundles:
npm run build - Run tests:
npm test
License
MIT © Andrei Kashcha
CLI
This package ships a small CLI for quick community detection (via npx or installed locally).
Input format is auto-detected by file extension (.dot/.gv/.json) or by content (tries JSON first, then DOT). Use --format to override.
Usage examples
- From a DOT file
npx ngraph.leiden --in graph.dot --out membership.json- From stdin
cat graph.dot | npx ngraph.leiden --membership-only- From a JSON edgelist (either an array of {source,target,weight?} or {nodes,links})
cat edges.json | npx ngraph.leiden > out.jsonOptions
--directed— treat input as directed--qualitymodularity|cpm (default modularity)--resolution <gamma>— CPM resolution parameter--candidate-strategyneighbors|all|random|random-neighbor--max-levels,--max-local-passes,--random-seed--max-community-size <num>--allow-new-community,--no-refine--fixed <file>— newline-separated node ids to keep fixed at level 0--membership-only— print only the id->community map
Output
- Defaults to JSON on stdout; use
--out <file>to write to a file. - Formats:
- JSON (default): full object
{membership, meta}or mapping only with--membership-only. - CSV:
--out-format csvprints headernodeId,communityId. - DOT:
--out-format dotoverlayscommunity="..."on nodes using ngraph.todot.
- JSON (default): full object
Quality-only evaluation
Evaluate modularity/CPM for an existing membership mapping without running detection:
npx ngraph.leiden \
--in graph.dot --format dot \
--evaluate --membership membership.json \
--quality modularity