@jbeuckm/k-means-js

v0.5.0

Published

2 years ago

A basic Javascript implementation of the cluster analysis algorithm.


jbeuckm

machine learning cluster analysis k-means

K-Means Clustering

A basic JavaScript implementation of the k-means cluster analysis algorithm.

Usage

Optionally, normalize the data.

The normalizer scales numerical data to the range [0, 1]. For a discrete field (e.g. category) with n possible values, it generates n outputs, each either zero or one.

// Tell the normalizer about the category field.
var params = {
   category: "discrete"
};

// Category is a discrete field with two possible values.
// Value is a linear field with continuous possible values.
var data = [
    {
       category: "a",
       value: 25
    },
    {
       category: "b",
       value: 7.6
    },
    {
       category: "a",
       value: 28
    }
];


// Load the module once; it is used again for denormalization.
var dataset = require('dataset');
var ranges = dataset.findRanges(params, data);
var normalized = dataset.normalize(data, ranges);
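
To make the two kinds of normalization concrete, here is a minimal sketch of min-max scaling and zero/one encoding applied to the sample records above. It is not the package's code: the function names (scaleLinear, encodeDiscrete, normalizeSketch) and the output layout (category outputs first, then the scaled value) are assumptions made for illustration.

```javascript
// Illustrative sketch only; names and output layout are hypothetical.

// Scale a number into [0, 1] given the observed minimum and maximum.
function scaleLinear(value, min, max) {
  if (max === min) return 0; // degenerate range: avoid dividing by zero
  return (value - min) / (max - min);
}

// Encode a discrete value as n outputs of zero or one.
function encodeDiscrete(value, categories) {
  return categories.map(function (c) { return c === value ? 1 : 0; });
}

// Normalize records shaped like { category: ..., value: ... }.
function normalizeSketch(data) {
  var values = data.map(function (d) { return d.value; });
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);

  // Collect the distinct categories in first-seen order.
  var categories = [];
  data.forEach(function (d) {
    if (categories.indexOf(d.category) === -1) categories.push(d.category);
  });

  return data.map(function (d) {
    return encodeDiscrete(d.category, categories)
      .concat([scaleLinear(d.value, min, max)]);
  });
}

var sample = [
  { category: "a", value: 25 },
  { category: "b", value: 7.6 },
  { category: "a", value: 28 }
];
console.log(normalizeSketch(sample));
```

Each record becomes a three-element point: two outputs for the two categories and one scaled value, so every dimension lies in [0, 1] before clustering.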

Run the algorithm.

// This non-normalized sample data with n=k is a pretty awful example.
var points = [
  [.1, .2, .3],
  [.4, .5, .6],
  [.7, .8, .9]
];

var k = 3;

var means = require('kmeans').algorithm(points, k, console.log);

The call to algorithm() finds the data's range in each dimension, generates k = 3 random points, and iterates until the means stop changing.
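
The three steps just described can be sketched as follows. This is a generic illustration of the standard k-means loop, not the package's implementation: kmeansSketch, its optional random argument, the iteration cap, and the handling of empty clusters are all assumptions.

```javascript
// Illustrative sketch only; kmeansSketch is a hypothetical name.
function kmeansSketch(points, k, random) {
  random = random || Math.random;
  var dims = points[0].length;

  // Find the data's range in each dimension.
  var mins = points[0].slice();
  var maxs = points[0].slice();
  points.forEach(function (p) {
    for (var d = 0; d < dims; d++) {
      mins[d] = Math.min(mins[d], p[d]);
      maxs[d] = Math.max(maxs[d], p[d]);
    }
  });

  // Generate k random starting means inside that range.
  var means = [];
  for (var i = 0; i < k; i++) {
    var m = [];
    for (var d = 0; d < dims; d++) {
      m.push(mins[d] + random() * (maxs[d] - mins[d]));
    }
    means.push(m);
  }

  function squaredDistance(a, b) {
    var sum = 0;
    for (var d = 0; d < a.length; d++) sum += (a[d] - b[d]) * (a[d] - b[d]);
    return sum;
  }

  // Iterate until no mean moves (capped as a safety net).
  var moved = true;
  for (var iter = 0; moved && iter < 1000; iter++) {
    // Assignment step: accumulate each point into its nearest mean.
    var sums = means.map(function () { return new Array(dims).fill(0); });
    var counts = new Array(k).fill(0);
    points.forEach(function (p) {
      var best = 0;
      for (var j = 1; j < k; j++) {
        if (squaredDistance(p, means[j]) < squaredDistance(p, means[best])) best = j;
      }
      counts[best]++;
      for (var d = 0; d < dims; d++) sums[best][d] += p[d];
    });

    // Update step: move each mean to the centroid of its points.
    moved = false;
    for (var j = 0; j < k; j++) {
      if (counts[j] === 0) continue; // empty cluster: leave the mean in place
      for (var d = 0; d < dims; d++) {
        var next = sums[j][d] / counts[j];
        if (next !== means[j][d]) moved = true;
        means[j][d] = next;
      }
    }
  }
  return means;
}

console.log(kmeansSketch([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]], 3));
```

Because the starting means are random, repeated runs can return different results, and a mean that attracts no points simply stays where it started in this sketch.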

Find the best K

This package implements the method described by Pham et al. It runs K-means repeatedly for different values of K, then returns its best guess for K along with the set of means found for that K during evaluation.

var pbk = require('phamBestK');

var maxKToTest = 10;
var result = pbk.findBestK(points, maxKToTest);

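console.log("this data has " + result.K + " clusters");
// JSON.stringify keeps the nested arrays readable; plain concatenation flattens them.
console.log("cluster centroids = " + JSON.stringify(result.means));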
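
For readers curious about how a best K can be chosen, here is a sketch of the evaluation function f(K) as published by Pham, Dimov and Nguyen, written from the paper's formula rather than from this package's source. The names phamScores and bestKFromScores are hypothetical, the input is assumed to be the total distortion S_K (sum of squared distances from points to their assigned means) for K = 1, 2, ..., and the number of dimensions is assumed to be greater than one.

```javascript
// Illustrative sketch only; function names are hypothetical.
// distortions[K - 1] holds S_K, the total distortion for K clusters.
// dims is the number of dimensions of the data (assumed > 1).
function phamScores(distortions, dims) {
  var scores = [];
  var alpha = 0;
  for (var K = 1; K <= distortions.length; K++) {
    // The weight alpha_K depends only on K and the dimensionality.
    if (K === 2) alpha = 1 - 3 / (4 * dims);
    if (K > 2) alpha = alpha + (1 - alpha) / 6;

    var prevS = distortions[K - 2];
    if (K === 1 || prevS === 0) {
      scores.push(1);
    } else {
      scores.push(distortions[K - 1] / (alpha * prevS));
    }
  }
  return scores;
}

// The K with the smallest f(K) is taken as the best guess.
function bestKFromScores(scores) {
  var best = 0;
  for (var i = 1; i < scores.length; i++) {
    if (scores[i] < scores[best]) best = i;
  }
  return best + 1;
}

// Made-up distortions for K = 1, 2, 3 on two-dimensional data.
var scores = phamScores([100, 20, 15], 2);
console.log(scores, bestKFromScores(scores));
```

A low f(K) means that adding the K-th cluster reduced the distortion much more than would be expected for uniformly spread data, which is why the minimum is used as the estimate.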

Denormalize data

Denormalization can be used to show the means discovered:

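// `dataset` is the module loaded with require('dataset');
// `ranges` is the result of the earlier findRanges() call.
for (var i = 0, l = result.means.length; i < l; i++) {
    console.log(dataset.denormalizeDatum(result.means[i], ranges));
}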
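
As a rough picture of what denormalization involves, the sketch below inverts min-max scaling and maps a group of zero/one outputs back to a category. It does not reproduce denormalizeDatum or the structure of ranges: unscaleLinear and decodeDiscrete are hypothetical helpers, and picking the largest output for a discrete field is an assumption about how a fractional mean might be interpreted.

```javascript
// Illustrative sketch only; helper names are hypothetical.

// Invert min-max scaling: map a value in [0, 1] back to [min, max].
function unscaleLinear(scaled, min, max) {
  return min + scaled * (max - min);
}

// Map n outputs back to a category by taking the largest output.
// A cluster mean usually holds fractions rather than exact zeros and ones.
function decodeDiscrete(outputs, categories) {
  var best = 0;
  for (var i = 1; i < outputs.length; i++) {
    if (outputs[i] > outputs[best]) best = i;
  }
  return categories[best];
}

console.log(unscaleLinear(0.5, 7.6, 28));
console.log(decodeDiscrete([0.8, 0.2], ["a", "b"]));
```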

Todo

denormalize data
provide ability to label data points, dimensions and means
build an asynchronous version of the algorithm
