npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2025 – Pkg Stats / Ryan Hefner

clustering-tfjs

v0.3.1

Published

High-performance TypeScript clustering algorithms (K-Means, Spectral, Agglomerative) with TensorFlow.js acceleration and scikit-learn compatibility

Readme

clustering-tfjs

npm version Build Status License: MIT TypeScript

Native TypeScript implementation of clustering algorithms powered by TensorFlow.js with full browser and Node.js support.

Features

  • ✅ Pure TypeScript/JavaScript (no Python required)
  • ✅ Multiple clustering algorithms (K-Means, Spectral, Agglomerative)
  • ✅ Powered by TensorFlow.js for performance
  • Works in both Node.js and browsers
  • ✅ Platform-optimized bundles (49KB for browser, 163KB for Node.js)
  • ✅ TypeScript support with full type definitions
  • ✅ GPU acceleration available (WebGL in browser, CUDA in Node.js)
  • ✅ Automatic backend selection
  • ✅ Extensively tested for parity with scikit-learn

Table of Contents

  1. Quick Start
  2. Installation
  3. Algorithms
  4. Validation Metrics
  5. Backend Selection
  6. API Reference
  7. Examples
  8. Performance
  9. Migration from scikit-learn
  10. Contributing
  11. License

Quick Start

Install

# For Node.js with acceleration
npm install clustering-tfjs @tensorflow/tfjs-node

# For Node.js with GPU support
npm install clustering-tfjs @tensorflow/tfjs-node-gpu

# For browser usage (TensorFlow.js loaded separately)
npm install clustering-tfjs

Note: For Windows users or if you encounter native binding issues, see our Windows Compatibility Guide.

Basic Usage

Browser

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/[email protected]/dist/tf.min.js"></script>
<script src="https://unpkg.com/clustering-tfjs/dist/clustering.browser.js"></script>

<script>
async function demo() {
  // Initialize the library
  await ClusteringTFJS.Clustering.init({ backend: 'webgl' });
  
  // Use algorithms
  const kmeans = new ClusteringTFJS.KMeans({ nClusters: 3 });
  const data = [[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]];
  const labels = await kmeans.fitPredict(data);
  console.log(labels); // [0, 0, 1, 1, 0, 2]
}
demo();
</script>

Node.js

import { Clustering } from 'clustering-tfjs';

// Initialize (optional - auto-detects best backend)
await Clustering.init();

// Use algorithms
const kmeans = new Clustering.KMeans({ nClusters: 3 });
const data = [[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]];
const labels = await kmeans.fitPredict(data);
console.log(labels); // [0, 0, 1, 1, 0, 2]

Installation

For Node.js

# Basic installation (pure JavaScript backend)
npm install clustering-tfjs

# Recommended: With native acceleration
npm install clustering-tfjs @tensorflow/tfjs-node

# Optional: With GPU support
npm install clustering-tfjs @tensorflow/tfjs-node-gpu

For Browser

The browser bundle is available via CDN:

<!-- Load TensorFlow.js -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/[email protected]/dist/tf.min.js"></script>

<!-- Load clustering-tfjs -->
<script src="https://unpkg.com/clustering-tfjs/dist/clustering.browser.js"></script>

Or install via npm and use with a bundler:

npm install clustering-tfjs @tensorflow/tfjs

Algorithms

K-Means Clustering

  • Classic centroid-based clustering
  • Supports custom initialization methods
  • K-Means++ initialization by default

Spectral Clustering

  • Graph-based clustering using eigendecomposition
  • Ideal for non-convex clusters
  • Supports custom affinity functions

Agglomerative Clustering

  • Hierarchical bottom-up clustering
  • Multiple linkage criteria (ward, complete, average, single)
  • Memory efficient implementation

Validation Metrics

The library includes three validation metrics to evaluate clustering quality and optimize the number of clusters:

Silhouette Score

Measures how similar an object is to its own cluster compared to other clusters. Range: [-1, 1], higher is better.

Davies-Bouldin Index

Evaluates intra-cluster and inter-cluster distances. Range: [0, ∞), lower is better.

Calinski-Harabasz Index

Ratio of between-cluster to within-cluster dispersion. Range: [0, ∞), higher is better.

Finding Optimal Number of Clusters

The library includes a built-in findOptimalClusters function that automatically determines the optimal number of clusters:

import { findOptimalClusters } from 'clustering-tfjs';

// Find optimal k between 2 and 10 clusters
const result = await findOptimalClusters(data, {
  minClusters: 2,
  maxClusters: 10,
  algorithm: 'kmeans'  // or 'spectral', 'agglomerative'
});

console.log(`Optimal number of clusters: ${result.optimal.k}`);
console.log(`Silhouette score: ${result.optimal.silhouette}`);
console.log(`All evaluations:`, result.evaluations);

// Advanced usage with custom scoring
const customResult = await findOptimalClusters(data, {
  maxClusters: 8,
  algorithm: 'spectral',
  algorithmParams: { affinity: 'nearest_neighbors' },
  metrics: ['silhouette', 'calinskiHarabasz'],  // Skip Davies-Bouldin
  scoringFunction: (evaluation) => evaluation.silhouette * 2 + evaluation.calinskiHarabasz
});

Platform Detection & Backend Selection

The library automatically detects your environment and selects the best backend:

import { Clustering } from 'clustering-tfjs';

// Check current platform
console.log('Platform:', Clustering.platform); // 'browser' or 'node'

// Check available features
console.log('Features:', Clustering.features);
// {
//   gpuAcceleration: true,
//   wasmSimd: false,
//   nodeBindings: true,
//   webgl: false
// }

// Manually select backend
await Clustering.init({ backend: 'webgl' }); // Browser
await Clustering.init({ backend: 'tensorflow' }); // Node.js

Available Backends

| Backend | Environment | Use Case | Performance | |---------|------------|----------|-------------| | cpu | Both | Pure JS fallback | Baseline | | webgl | Browser | GPU acceleration | 5-10x faster | | wasm | Browser | CPU optimization | 2-3x faster | | tensorflow | Node.js | Native bindings | 10-20x faster |

The library automatically selects the best available backend if not specified.

API Reference

Common Interface

All algorithms implement the same interface:

interface ClusteringAlgorithm {
  fit(X: Tensor2D | number[][]): Promise<void>;
  predict(X: Tensor2D | number[][]): Promise<number[]>;
  fitPredict(X: Tensor2D | number[][]): Promise<number[]>;
}

KMeans

new KMeans({
  nClusters: number;
  init?: 'k-means++' | 'random' | number[][];
  nInit?: number;
  maxIter?: number;
  tol?: number;
  // backend selection coming in future version
})

SpectralClustering

new SpectralClustering({
  nClusters: number;
  affinity?: 'rbf' | 'nearest_neighbors';
  gamma?: number;
  nNeighbors?: number;
  // backend selection coming in future version
})

AgglomerativeClustering

new AgglomerativeClustering({
  nClusters: number;
  linkage?: 'ward' | 'complete' | 'average' | 'single';
  // backend selection coming in future version
})

Validation Metrics

// Silhouette Score: [-1, 1], higher is better
silhouetteScore(X: Tensor2D | number[][], labels: number[]): Promise<number>

// Davies-Bouldin Index: [0, ∞), lower is better  
daviesBouldin(X: Tensor2D | number[][], labels: number[]): Promise<number>

// Calinski-Harabasz Index: [0, ∞), higher is better
calinskiHarabasz(X: Tensor2D | number[][], labels: number[]): Promise<number>

Examples

Coming soon: Example notebooks and CodePen demos

Performance

Based on our benchmarks:

  • K-Means: 0.5ms - 200ms depending on dataset size
  • Spectral: 10ms - 2s (includes eigendecomposition)
  • Agglomerative: 5ms - 500ms

See benchmarks/ for detailed performance data.

Migration from scikit-learn

# scikit-learn
from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=3)
labels = kmeans.fit_predict(X)
// clustering-js
import { KMeans } from 'clustering-tfjs';
const kmeans = new KMeans({ nClusters: 3 });
const labels = await kmeans.fitPredict(X);

Scikit-learn Compatibility

This library has been extensively tested for numerical parity with scikit-learn. Our test suite includes:

  • Step-by-step comparisons with sklearn implementations
  • Identical results for standard datasets
  • Matching behavior for edge cases

See tools/sklearn_comparison/ for detailed comparison scripts and test/ for parity tests.

Contributing

See CONTRIBUTING.md for guidelines on contributing to this project.

License

MIT


Note: This library is under active development. APIs may change in future versions.