ALS Statistics
ALS Statistics is a modular JS toolkit for statistical work. It’s designed to be:
- Quality. Numerics verified: this release matches Python (NumPy/SciPy) reference outputs across modules and passes the deterministic Golden Test Suite on Node.js 20.x, all within published EPS tolerances. Reproducible via `node goldens/test.js` and `npm test`.
- Easy to use like `Math` for small one-liners;
- Composable for multi-step analyses (filter → group → compare → summarize);
- Runtime-agnostic — the same API in Node.js and in the browser;
- Data-model light — works with plain arrays (`number[]`) and small helpers like `Column` and `Table`;
- Browser-ready. No native dependencies; works in the browser (as ESM or via the included UMD bundle).
Think of it as a “batteries-included” stats toolbox rather than a full data-frame ecosystem. If you know SPSS: ALS gives you many of the common procedures (correlations, t-tests, ANOVA, reliability, basic clustering, regression) with code-first ergonomics. If you know NumPy/SciPy: ALS focuses on analytics primitives and convenience wrappers (no heavy data containers, no plotting).
Why the rewrite?
The v1 architecture had grown too complex (intertwined modules, heavy abstractions), which made adding features and maintaining consistency difficult.
v2 was rebuilt from scratch with a simpler core (plain arrays + lightweight Column/Table), clear module boundaries, and predictable numerics—so new analytical tools can be added quickly without increasing complexity.
Key ideas
- Plain data in / plain results out. Most functions take `{ [name]: number[] }` or `number[]` and return simple objects (e.g. `{ r, t, df, p }`).
- Two modes of use:
  - One-liners via descriptive helpers (mean, stdDev, percentiles…).
  - Structured analyzers for correlations, mean comparisons, regressions, clustering, etc.
- Table utilities. Sort, filter, split by group, compute derived columns, and feed the result to an analyzer.
Installation
npm i als-statistics
Usage in browser
<script type="module" src="/node_modules/als-statistics/lib/index.js"></script>
or
<script src="/node_modules/als-statistics/statistics.js"></script>
or
<script type="module">
import Statistics from '/node_modules/als-statistics/lib/index.js'
</script>
Node.js
import { Analyze, Stats, Table, Column } from 'als-statistics';
// or
const { Analyze, Stats, Table, Column } = require('als-statistics/statistics.cjs')
const { CDF, CompareMeans, Correlate, Clustering, Regression } = Analyze;
const { constants, t, f, phi } = CDF;
const { IndependentTTest, OneWayAnova, PairedTTest, OneSampleTTest } = CompareMeans;
const { CronbachAlpha, Pearson, Spearman, Kendall } = Correlate;
const { Dbscan, Hdbscan, computeDistances } = Clustering;
const { LinearRegression, LogisticRegression } = Regression;
// Descriptive stats (one-liners)
const {
sum, mean, median, mode, min, max, // central tendency
variance, varianceSample, stdDev, stdDevSample, cv, range, iqr, mad, // dispersion & scale
percentile, q1, q3, p10, p90, // position & percentiles
zScore, zScores, zScoresSorted, outliersZScore, outliersIQR, // z-scores & outliers
weightedMean, confidenceInterval, slope, regressionSlope, // misc
spectralPowerDensityArray, spectralPowerDensityMetric,
sorted, ma, sumOfSquares, flatness, skewness, kurtosis, // other statistics
skewnessSample, kurtosisSample, geometricMean, harmonicMean,
noiseStability, frequencies, relativeFrequencies,
relativeDispersion, normalizedValues, xValues,
recode, // recode values
} = Stats;
The package is modular — import only what you use.
Quick starts
1) Use it like Math (one-liners)
import { Stats } from 'als-statistics';
const X = [10, 12, 13, 9, 14];
const mu = Stats.mean(X);
const sd = Stats.stdDevSample(X);
const p90 = Stats.p90(X);
console.log({ mu, sd, p90 });
// → { mu: 11.6, sd: 2.073..., p90: 13.8 }
You can also access many metrics via Column:
import { Column } from 'als-statistics';
const col = new Column([10, 12, 13, 9, 14], 'Score');
const { mean, stdDev, median, frequencies, flatness } = col;
2) Quick analysis: correlation in one line
import { Analyze } from 'als-statistics';
const data = {
gender: [0, 1, 0, 1, 1, 0], // 0=female, 1=male
score: [62, 75, 70, 81, 64, 78],
};
const pearson = new Analyze.Correlate(data).pearson('gender', 'score');
const { r, t, df, p } = pearson;
console.log({ r, t, df, p });
// r in [-1, 1], two-sided p-value in [0, 1]
3) Compare means: Welch t-test (unequal variances)
import { Analyze } from 'als-statistics';
const data = {
men: [62, 75, 70, 81, 64],
women: [78, 73, 69, 71, 74, 77],
};
const test = new Analyze.CompareMeans(data).independentWelch('men', 'women');
console.log({ t: test.t, df: test.df, p: test.p });
4) One-way ANOVA (classic & Welch)
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
const data = {
A: [10, 11, 9, 10],
B: [10, 30, -10, 50, -20],
C: [12, 13, 12, 11, 14],
};
const classic = new CompareMeans(data).anova(); // pooled (equal variances)
const welch = new CompareMeans(data).anovaWelch(); // unequal variances
console.log({
classic: { F: classic.F, df1: classic.dfBetween, df2: classic.dfWithin, p: classic.p },
welch: { F: welch.F, df1: welch.dfBetween, df2: welch.dfWithin, p: welch.p },
});
5) Table-first workflow (filter → split → analyze)
import { Table } from 'als-statistics';
const t = new Table(
{ gender: [0,1,0,1,1,0], age: [21,22,20,23,19,22], score: [62,75,70,81,64,78] },
{ name: 'Survey' }
);
// Keep adults 21+
t.filterRowsBy('age', a => a >= 21);
// Compare score by gender with Welch
// Option A: already split into columns:
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
const cm = new CompareMeans({ men: [...], women: [...] }).independentWelch('men', 'women');
// Option B: split first, then pass to CompareMeans:
const groups = t.splitBy('gender'); // returns { groupName: number[] }
const test = new CompareMeans(groups).independentWelch('0', '1');
Data managing (Tables and Columns)
This section explains how data flows through Columns, Tables and Statistics: validation rules, caching, safe updates, and the most common operations you’ll use before running analytics.
Notes & pitfalls
- Always mutate via API. Use `Column` mutators or the `values` setter; avoid direct array mutation to keep caches correct.
- Invalids. `Column.invalid` stores indices of rejected values; descriptives and analyses ignore them.
- Mutability. Most `Table` methods are in-place and return `this`. Prefer `clone()` when you need a safe branch.
- Alignment. If you disable alignment and keep ragged columns, be mindful when exporting rows or running analyses that expect equal lengths.
- HTML output. `htmlTable()` is for quick previews; for full reports, prefer exporting rows and rendering via your own templates.
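A short sketch of the first three rules above (API-only mutation, invalid tracking, and cloning before a destructive change); data is illustrative:

```js
import { Table, Column } from 'als-statistics';

// Invalid inputs are tracked, not silently dropped.
const col = new Column([5, 7, NaN, 9], 'X');
console.log(col.invalid);           // indices of rejected values
col.push(11);                       // mutate via the API so caches stay correct

// Branch before a destructive filter; the original table stays intact.
const t = new Table({ age: [19, 23, 31, 42], score: [60, 72, 68, 80] });
const adults = t.clone('adults-only');
adults.filterRowsBy('age', a => a >= 21);
console.log(t.n, adults.n);         // 4 rows vs 3 rows
```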
Column
Quick API snapshot
import Statistics ,{ Table, Column } from 'als-statistics';
// Column
// static
Column.key(name, ...parts)
// properties/getters
col.name
col.labels? // optional labels aligned with values
col.invalid // indices of invalid inputs
col.values // get/set (validated)
col.n // length
// cache/events
col.$(key, compute) // memoize custom computations
col.onchange(fn) // subscribe to structural changes
// mutation helpers (invalidate caches automatically)
col.addValue(value, index?)
col.deleteValue(index)
col.clone(name?)
col.insertAt(index, ...items)
col.setAt(index, item)
col.removeAt(index, deleteCount=1)
col.splice(start, deleteCount, ...items)
col.push(...items)
// descriptive on Column (same names as Stats one-liners)
col.sum, col.mean, col.median, col.mode
col.variance, col.varianceSample, col.stdDev, col.stdDevSample, col.cv, col.range, col.iqr, col.mad
col.percentile(p), col.q1, col.q3, col.p10, col.p90
col.zScore(v), col.zScores(), col.zScoresSorted(), col.outliersZScore(z=3), col.outliersIQR()
col.weightedMean(weights), col.confidenceInterval, col.slope, col.regressionSlope(customX)
col.spectralPowerDensityArray, col.spectralPowerDensityMetric
How it works (principles)
- Validation-first. Columns accept only finite numbers. Any non-finite input (`NaN`, `±Infinity`, non-number) is rejected or tracked via `col.invalid`, and excluded from descriptive metrics.
- Cached results. Many results are cached (e.g., `col.mean`, `col.stdDev`). To keep caches correct, you must not mutate the underlying array directly. Instead, either:
  - assign a new array via the validated setter: `col.values = [...newNumbers]`, or
  - use the provided mutators (`setAt`, `splice`, `push`, …).
  These paths automatically invalidate caches and fire `onchange` events.
- Alignment in tables. By default, a `Table` aligns columns to a common length (truncates to the shortest column). You can change this behavior with constructor options (e.g., `alignColumns: false`, `minK`) or call `t.alignColumns()` explicitly.
- In-place transforms. Most `Table` methods mutate. Chain them freely, or use `clone()` to keep the original around.
Creating and validating
import { Column } from 'als-statistics';
const scores = new Column([10, 12, 13, 9, 14], 'Score');
// set a new validated series (replaces data, clears caches)
scores.values = [11, 11, 10, 12, 15];
// invalid values are tracked and excluded from stats
scores.values = [11, 12, NaN, 10, 9, Infinity];
console.log(scores.invalid); // [2, 5]
console.log(scores.mean); // mean over valid entries only
Do not mutate `scores.values` in place (e.g., `scores.values[0] = 999`), as caches won’t know about it. Use `setAt(...)` instead.
Safe mutations (cache-aware)
// append values
scores.push(10, 11);
// insert at position
scores.insertAt(1, 99);
// replace a single value
scores.setAt(0, 12);
// delete & splice
scores.deleteValue(2);
scores.splice(3, 1, 50, 51);
All of these invalidate caches and emit onchange:
scores.onchange((col, prev, meta) => {
console.log('column changed:', meta.type)
});
Caching your own computations
// memoize expensive custom metric
const kurt = scores.$('kurtosis', () => {
// compute once, then served from cache until data changes
return scores.kurtosis; // or any custom formula
});
Descriptives on Column
Every descriptive method available in Stats exists on Column too and always respects validation/caching:
console.log({
mean: scores.mean,
sd : scores.stdDevSample,
q1 : scores.q1,
p90 : scores.p90,
outliersZ: scores.outliersZScore(3)
});
Table
Quick API snapshot
import { Table } from 'als-statistics';
const t = new Table(data?, { name?, minK?, alignColumns? })
// properties/getters
t.n // rows count
t.k // columns count
t.columns // map of Column
t.colNames // string[]
t.colValues // Record<string, number[]>
t.json // plain object view
// row/column transforms (in-place; use clone() to branch)
t.addColumn(name, values, labels?) -> Column
t.deleteColumn(name) -> this
t.addRow(row, index?) -> this
t.addRows(rows, index?) -> this
t.deleteRow(index) -> this
t.alignColumns() -> this
// data shaping
t.recode(colName, mapper, newColName?) -> void
t.compute(fn, name) -> Column
t.filterRows(indexes) -> this
t.filterRowsBy(colName, predicate) -> this
t.sortBy(colName, asc=true) -> this
t.clone(name?, colFilter=[]) -> Table
t.splitBy(colName, labels?) -> Statistics
t.transpose(colNames=[]) -> Table
t.where(rowPredicate) -> number[]
t.rows(withKeys=true) -> object[] | any[][]
t.htmlTable(colFilter=[], options?) -> string
t.descriptive(...metricNames) -> Object{} // Descriptive statistics for all columns
// analysis shortcuts
t.correlate(...colFilter) -> Correlate
t.compareMeans(...colFilter) -> CompareMeans
t.dbscan(colFilter, options?) -> Dbscan
t.hdbscan(colFilter, options?) -> Hdbscan
t.regression(yName, xNames, type='linear'|'logistic') -> Regression
t.linear(yName, xNames)
t.logistic(yName, xNames)
Tip: operations on `Table` are mutable by default (they change the same instance). Use `t.clone(...)` to branch a copy for “what-if” scenarios.
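The `descriptive(...metricNames)` shortcut gives a quick per-column summary; a minimal sketch (the exact shape of the returned object is an assumption; check the JSDoc):

```js
import { Table } from 'als-statistics';

const t = new Table({ age: [21, 22, 20, 23], score: [62, 75, 70, 81] });

// Metric names match the Stats one-liners / Column getters.
const summary = t.descriptive('mean', 'stdDevSample', 'median');
console.log(summary); // assumed shape: { age: { mean, stdDevSample, median }, score: { ... } }
```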
Constructing and alignment
import { Table } from 'als-statistics';
const t = new Table(
{ gender: [0,1,0,1,1,0], age: [21,22,20,23,19], score: [62,75,70,81,64,78] },
{ name: 'Survey', alignColumns: true, minK: 2 }
);
// When alignColumns=true (default), columns are trimmed to the shortest length.
// You can turn this off via { alignColumns: false } if you need ragged columns.
console.log(t.n, t.k, t.colNames); // rows, columns, names
// Access Column objects
const scoreCol = t.columns['score'];
console.log(scoreCol.mean);
Rows & columns (synchronization)
// add/delete columns
t.addColumn('bmi', [22.1, 24.0, 23.7, 25.3, 21.8]);
t.deleteColumn('age');
// add rows (object keys match column names)
t.addRow({ gender: 0, score: 71, bmi: 23.1 });
t.addRows([
{ gender: 1, score: 68, bmi: 24.2 },
{ gender: 0, score: 77, bmi: 22.7 }
]);
// delete rows
t.deleteRow(0);
// re-align explicitly if needed
t.alignColumns();
Data shaping
// recode values (e.g., 0/1 -> 'F'/'M'), optionally write to a new column
t.recode('gender', g => (g === 0 ? 'F' : 'M'), 'genderLabel');
// compute a derived numeric column
t.compute(row => row.score / (row.bmi ?? 1), 'scorePerBmi');
// filter & sort (in place)
t.filterRowsBy('score', s => s >= 70);
t.sortBy('score', /*asc=*/false);
// pick rows by predicate (returns indices)
const adultIdx = t.where(row => row.bmi >= 22 && row.bmi <= 25);
// grab data in different shapes
const rowsAsObjects = t.rows(true);
const rowsAsArrays = t.rows(false);
const html = t.htmlTable(['genderLabel','score','bmi']);
Split & analyze
// split one column into groups, then run an analysis
const groups = t.splitBy('genderLabel'); // => { F: number[], M: number[] }
import { Analyze } from 'als-statistics';
const test = new Analyze.CompareMeans(groups).independentWelch('F', 'M');
console.log({ t: test.t, df: test.df, p: test.p });
// or use shortcuts directly from Table
const corr = t.correlate('score','bmi').pearson();
console.log({ r: corr.r, p: corr.p });
Transpose and clone
// transpose a subset of columns (handy for certain distance/clustering operations)
const t2 = t.transpose(['score','bmi']);
// clone to branch a scenario without touching the original
const tClone = t.clone('scenario: filtered', ['score','bmi']);
Statistics (multi-table manager)
Statistics is a lightweight coordinator for multiple Tables. It lets you:
- register tables (`addTable`),
- compute the union of available column names (`colNames`),
- combine the same columns from different tables into a new `Table` (`columns(...)`),
- remove tables (`deleteTable`),
- and access the module namespaces (static): `Statistics.Table`, `Statistics.Stats`, `Statistics.Analyze`, `Statistics.Column`.
It’s especially handy for before/after designs, or when you split one table by a factor and then want to analyze the resulting groups together.
API
new Statistics(name?: string)
statistics.addTable(obj: Record<string, number[]>, options?: { name?: string, minK?: number, alignColumns?: boolean }): Table
statistics.deleteTable(tableName: string): void
// set of distinct column names across all registered tables
statistics.colNames: string[]
// Combine selected columns (from *every* table that has them) into a new Table.
// Result columns are named `${tableName}_${colName}`.
statistics.columns(name: string, ...colFilter: (string|RegExp)[]): Table
// Static accessors (namespaces)
Statistics.Table
Statistics.Stats
Statistics.Analyze
Statistics.Column
Column selection (colFilter)
columns(name, ...colFilter) uses the same filtering helper as Table:
- pass exact names: `columns('X', 'score')`
- pass regex: `columns('X', /^score|age$/)`
- exclude by prefixing with `-`: `columns('X', 'score', '-score_z')`
Examples
1) Before/After (paired)
import Statistics from 'als-statistics';
const { CompareMeans } = Statistics.Analyze;
const S = new Statistics('A/B');
// register two tables with the same column name "score"
S.addTable({ score: [62, 71, 69, 73, 75] }, { name: 'before' });
S.addTable({ score: [70, 76, 70, 78, 79] }, { name: 'after' });
// collect score columns from all tables into one Table
const merged = S.columns('Scores', 'score'); // -> columns: before_score, after_score
// run paired t-test using the Table shortcut
const paired = merged.compareMeans('before_score', 'after_score').paired();
console.log({ t: paired.t, df: paired.df, p: paired.p });
2) Split → Combine → Independent Welch
import { Table } from 'als-statistics';
import Statistics from 'als-statistics';
const { CompareMeans } = Statistics.Analyze;
const t = new Table(
{ group: [0,1,0,1,0,1], score: [62,75,70,81,64,78] },
{ name: 'Survey' }
);
// split by "group" → returns a Statistics instance with one table per group
const S = t.splitBy('group', { 0: 'control', 1: 'treat' });
// bring the "score" columns from each split table into ONE Table
const merged = S.columns('scored', 'score'); // control_score, treat_score
const test = merged.compareMeans('control_score','treat_score').independentWelch();
console.log({ t: test.t, df: test.df, p: test.p });
3) Cross-table correlation
const merged = S.columns('ab', 'score'); // e.g., before_score, after_score
const corr = merged.correlate('before_score','after_score').pearson();
console.log({ r: corr.r, p: corr.p });
Scenarios
A) Before/After (pre→post) in separate tables
import Statistics from 'als-statistics';
const S = new Statistics();
S.addTable(preTable, { name: 'pre' });
S.addTable(postTable, { name: 'post' });
// Merge the same column name from multiple tables
const merged = S.columns('merged', 'score'); // pre_score, post_score
const cm = merged.compareMeans('pre_score','post_score').paired();
console.log({ t: cm.t, df: cm.df, p: cm.p });
B) Split → Combine workflow
const S = new Statistics();
S.addTable(rawTable, { name: 'raw' });
// Split by factor into two new tables
const { control, treat } = S.split('raw', by => by.group === 'A' ? 'control' : 'treat');
// Combine same-named columns for cross-table analysis
const merged = S.columns('combined', 'score'); // control_score, treat_score
const res = merged.compareMeans('control_score','treat_score').independentWelch();
How‑to recipes
- Compute cross-table correlation between `before_score` and `after_score`:
  const merged = S.columns('ab', 'score');
  merged.correlate('before_score','after_score').pearson();
- Build a summary sheet for multiple tables (mean, sd, n):
  const names = S.colNames;
  const rows = names.map(col => {
    const t = S.columns('tmp', col);
    const d = t.describe(`${col}_0`); // first
    return { col, mean: d.mean, sd: d.stdDevSample, n: d.n };
  });
Practical patterns
A. Pipeline “sort → split → test”
import { Table } from 'als-statistics';
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
// sort by score, keep top 100 rows, split by gender, compare means
const t = new Table(data).sortBy('score', false);
const top = t.clone('Top').filterRows([...Array(100).keys()]); // keep first 100 indices
const groups = top.splitBy('gender'); // returns small structure per group
const cm = new CompareMeans(groups);
const res = cm.independentWelch('0','1');
console.log(res.p < 0.05 ? 'Significant' : 'NS');
B. Correlations with filters
import { Table } from 'als-statistics';
const t = new Table(data);
t.filterRowsBy('age', a => a >= 25 && a <= 40);
const corr = t.correlate('height','weight').pearson();
console.log(corr.r, corr.p);
C. Quick reliability check
import { Analyze } from 'als-statistics';
const items = { Q1: [...], Q2: [...], Q3: [...], Q4: [...] };
const alpha = new Analyze.Correlate.CronbachAlpha(items);
console.log(alpha.alpha, alpha.htmlTable);
D. Minimal regression report
import { Analyze } from 'als-statistics';
const reg = new Analyze.Regression(dataset, { yName: 'y', xNames: ['x1','x2'], type: 'linear' });
// step 1
reg.steps[0].calculate();
console.log(reg.steps[0].result); // table-like object for reporting
Analyze · CDF
Cumulative distribution functions used by other tests.
Exports
- `CDF.regularizedIncompleteBeta(x, a, b): number` – Regularized incomplete beta Iₓ(a,b). Clamps to `[0,1]` when `x ≤ 0` or `x ≥ 1`.
- `CDF.t(x, df): number` – CDF of the Student t distribution. `df` must be positive.
- `CDF.f(x, df1, df2): number` – CDF of the F distribution. `df1, df2` must be positive.
- `CDF.phi(x): number` – Standard normal CDF Φ(x). Returns 0/1 for large negative/positive tails and supports `±Infinity`.
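These are the same CDFs the test classes use to turn a statistic into a p-value. A hedged sketch of doing that conversion by hand (the two-sided/right-tail formulas below are standard statistical conventions, not documented helpers):

```js
import { Analyze } from 'als-statistics';
const { CDF } = Analyze;

// Two-sided p-value from a t statistic and its degrees of freedom.
const t = 2.31, df = 9;
const pT = 2 * (1 - CDF.t(Math.abs(t), df));

// Right-tail p-value for an F statistic (as used by ANOVA).
const pF = 1 - CDF.f(4.2, 2, 12);

// Two-sided p-value from a z statistic via the standard normal CDF.
const pZ = 2 * (1 - CDF.phi(Math.abs(1.96))); // ≈ 0.05
console.log({ pT, pF, pZ });
```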
Analyze · Compare Means
Entry-point wrapper CompareMeans for mean-comparison tests (t-tests, ANOVA).
Class: CompareMeans
Constructor
new CompareMeans(data: Record<string, number[]>)
- `data` – object mapping group name → numeric array.
Methods
- `paired(...colNames): PairedTTest` – paired t-test on two named columns; trims to equal length.
- `independent(...colNames): IndependentTTest` – two-sample Student t-test (pooled variance).
- `independentWelch(...colNames): IndependentTTest` – two-sample Welch t-test.
- `anova(...colNames): OneWayAnova` – one-way ANOVA (pooled/“classic”).
- `anovaWelch(...colNames): OneWayAnova` – Welch’s one-way ANOVA.
- `oneSample(colName?, mu0=0): OneSampleTTest` – one-sample t-test for a single column (defaults to the first key if `colName` is omitted).
All methods accept optional column names. If omitted, the test uses all keys from the constructor data.
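For example (data illustrative):

```js
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;

const data = { A: [10, 11, 9, 10], B: [12, 13, 12, 11] };

// Explicit column names:
const welch = new CompareMeans(data).independentWelch('A', 'B');

// Omitted names: all keys from the constructor data are used.
const anova = new CompareMeans(data).anova();

// One-sample test of column A against a hypothesized mean of 10.
const one = new CompareMeans(data).oneSample('A', 10);
console.log(welch.p, anova.p, one.p);
```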
One‑Way ANOVA
Classic (pooled) and Welch’s ANOVA.
Class: OneWayAnova (returned by CompareMeans.anova / anovaWelch)
Constructor
new OneWayAnova(data: Record<string, number[]>, welch=false)
- `data` – group → values.
- Set `welch=true` for Welch ANOVA.
Public fields
- `F: number`
- `dfBetween: number`
- `dfWithin: number`
- `p: number` – right-tail p-value via F CDF.
- `k: number` – number of groups.
- `msw: number` – mean square within.
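For example, constructing the class directly (equivalent to the `CompareMeans.anova` / `anovaWelch` shortcuts; data is illustrative):

```js
import { Analyze } from 'als-statistics';
const { OneWayAnova } = Analyze.CompareMeans;

const data = {
  A: [10, 11, 9, 10],
  B: [14, 13, 15, 16, 14],
  C: [12, 13, 12, 11, 14],
};

const classic = new OneWayAnova(data);        // pooled ("classic") ANOVA
const welch   = new OneWayAnova(data, true);  // Welch's ANOVA
console.log(classic.F, classic.dfBetween, classic.dfWithin, classic.p);
console.log(welch.p, welch.k, welch.msw);
```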
Independent Samples T‑Test
Two-sample t‑test. Supports Student (pooled) and Welch variants.
Class: IndependentTTest (returned by CompareMeans.independent / independentWelch)
Constructor
new IndependentTTest({ g1: number[], g2: number[] }, welch=false)
- Set `welch=true` for Welch’s unequal-variance t-test.
Public fields
- `t: number` – t statistic.
- `df: number` – degrees of freedom (Welch uses Satterthwaite).
- `p: number` – two-sided p-value (getter).
- `F: number` – ANOVA-equivalent `t²` (getter).
- `leveneF: number` – Levene’s F for equality of variances.
- `leveneDf1: number`, `leveneDf2: number`, `leveneP: number` – Levene’s test details.
- `k: number` – number of groups (always 2 here).
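A sketch of using the Levene fields to choose between the pooled and Welch variants (the 0.05 cut-off is a common convention, not part of the API; data is illustrative):

```js
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;

const groups = { men: [62, 75, 70, 81, 64], women: [78, 73, 69, 71, 74, 77] };

// Run the pooled test first; Levene's statistics are exposed on the result.
const pooled = new CompareMeans(groups).independent('men', 'women');
const test = pooled.leveneP < 0.05
  ? new CompareMeans(groups).independentWelch('men', 'women') // variances differ
  : pooled;                                                   // pooled is fine
console.log({ t: test.t, df: test.df, p: test.p });
```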
One-Sample T‑Test
Class: OneSampleTTest (returned by CompareMeans.oneSample)
Constructor
new OneSampleTTest({ X: number[] }, mu0=0)
- Requires n ≥ 2.
Public fields
- `t: number` – t statistic.
- `df: number` – `n - 1`.
- `p: number` – two-sided p-value.
- `sd: number` – sample standard deviation.
- `se: number` – standard error `sd / sqrt(n)`.
- `mu0: number` – hypothesized mean.
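For example (data illustrative):

```js
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;

// Is the mean reaction time different from a hypothesized 300 ms?
const rt = { reaction: [288, 310, 305, 296, 315, 302] };
const test = new CompareMeans(rt).oneSample('reaction', 300);
console.log({ t: test.t, df: test.df, p: test.p, se: test.se });
```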
Paired T‑Test
Paired (dependent) samples t‑test.
Class: PairedTTest (returned by CompareMeans.paired)
Constructor
new PairedTTest({ A: number[], B: number[] })
- Requires at least two arrays; internally trims to the same length.
Public fields
- `t: number` – t statistic.
- `df: number` – degrees of freedom (`n - 1`).
- `p: number` – two-sided p-value (Student t).
- `meanDelta: number` – mean of pairwise differences.
- `sdDelta: number` – sample SD of differences.
- `n: number` – number of paired observations.
- `diffs: number[]` – raw pairwise deltas (`A[i] − B[i]`).
Correlate — practical usage
Two columns vs matrix
import { Analyze } from 'als-statistics';
const { Correlate } = Analyze;
// 1) EXACTLY TWO columns → returns a single test instance
const one = new Correlate({ X: [1,2,3], Y: [2,4,9] }).pearson('X', 'Y');
console.log(one.r, one.t, one.df, one.p);
// 2) THREE OR MORE columns → returns a map of pairwise results
const all = new Correlate({ A:[...], B:[...], C:[...] }).pearson();
console.log(Object.keys(all)); // ['A|B','A|C','B|C']
console.log(all['A|B'].r, all['A|B'].p);
Population vs sample (Pearson)
- `pearson()` — uses population covariance in the r-formula.
- `pearsonSample()` — uses sample covariance.
- Both provide a two-sided `p` via the t-distribution with `df = n - 2`.
const p1 = new Correlate(data).pearson(); // population r
const p2 = new Correlate(data).pearsonSample(); // sample r
Spearman & Kendall (ties handled)
const s = new Correlate({ X:[...], Y:[...] }).spearman('X','Y');
const k = new Correlate({ X:[...], Y:[...] }).kendall('X','Y');
console.log(s.r, s.p, k.tau, k.p);
Two-sided helpers: `.spearmanTwoSided()` and `.kendallTwoSided()`.
Reliability — Cronbach’s alpha
// Option A: import the class directly
import { CronbachAlpha } from 'als-statistics/analyze/correlate/cronbach-alpha.js';
// Option B: via the namespace
import { Analyze } from 'als-statistics';
const { Correlate } = Analyze;
// new Correlate.CronbachAlpha(table) // same class
const items = { Q1:[...], Q2:[...], Q3:[...] };
const alpha = new CronbachAlpha(items);
console.log(alpha.alpha); // overall alpha
console.log(alpha.ifItemsDeleted); // { Q1: α_if_deleted, ... }
console.log(alpha.htmlTable); // ready-to-embed HTML with a small table
Notes:
- `Correlate` methods auto-trim vectors to the shortest length where needed (e.g., Spearman).
- Pairwise matrices return a plain object of test instances keyed as `'A|B'`.
Cronbach’s Alpha
Class: CronbachAlpha
Constructor
new CronbachAlpha(data: Record<string, number[]>)
- Requires ≥ 2 parallel scales/items of equal length.
Public fields
- `alpha: number` – reliability estimate.
- `sumOfVariances: number` – sum of item variances (sample).
- `sumColumnVariance: number` – variance of summed score across rows.
- `bessel: number` – correction factor `k/(k-1)`.
- `ifItemsDeleted: Record<string, number>` – getter recomputed lazily.
- `htmlTable: string` – formatted summary table (getter).
Kendall Rank Correlation (τ)
Class: Kendall
Constructor
new Kendall({ X: number[], Y: number[] }, twoSided=true)
Public fields
- `tau: number`
- `z: number` – normal approximation for significance.
- `p: number` – p-value (two-sided by default).
- `t: number` – alias of `z` (for consistency with other tests).
- `df: number` – `Infinity` (normal approximation).
Pearson Correlation
Class: Pearson
Constructor
new Pearson({ X: number[], Y: number[] }, population=false)
- When `population=true`, covariance uses the population denominator.
Public fields
- `covariance: number`
- `df: number` – `n - 2`
- `r: number` – correlation coefficient in `[-1, 1]`
- `t: number` – test statistic
- `p: number` – two-sided p-value
Spearman Rank Correlation
Class: Spearman
Constructor
new Spearman({ X: number[], Y: number[] })
Public fields
- `r: number` – Spearman’s rho
- `t: number` – t-approximation of significance
- `p: number` – two-sided p-value
- `n: number` – number of paired observations (shorter input is trimmed)
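These classes can also be constructed directly; they are exposed on the `Analyze.Correlate` namespace (data illustrative):

```js
import { Analyze } from 'als-statistics';
const { Pearson, Spearman, Kendall } = Analyze.Correlate;

const X = [1, 2, 3, 4, 5];
const Y = [2, 4, 5, 4, 6];

const pearson  = new Pearson({ X, Y });   // r, t, df, p
const spearman = new Spearman({ X, Y });  // r, t, p, n
const kendall  = new Kendall({ X, Y });   // tau, z, p (two-sided by default)
console.log(pearson.r, spearman.r, kendall.tau);
```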
Analyze · Clustering
Density-based clustering over columns using precomputed distances between series.
Class: Dbscan
Constructor
new Dbscan(data: Record<string, number[]>, options?: { eps?: number, minPts?: number, metric?: 'mad' })
- `eps` (default `0.4`), `minPts` (default `3`), `metric` (default `'mad'`).
Public fields
- `metric: string` · `eps: number` · `minPts: number`
- `labels: number[]` – `0` unvisited, `-1` noise, `1..` cluster id per column.
- `clusters: Array<{ id:number, columns:string[] }>` – built by `buildClusters`.
- `distances: number[][]` – symmetric distance matrix.
- Core methods (invoked by constructor): `findNeighbors(i)`, `expandCluster(i, clusterId)`, `run()`.
Class: Hdbscan
Constructor
new Hdbscan(data: Record<string, number[]>, options?: { metric?: 'mad', minClusterSize?: number })
- `minClusterSize` defaults to `2`.
Public fields
- `metric: string`, `minClusterSize: number`
- `labels: number[]` – final labels per column.
- `clusters: Array<{ id:number, columns:string[] }>`
- `mreachDistances: number[][]` – mutual reachability distances.
- `mst: Array<[i,j,weight]>` – minimum spanning tree.
- `hierarchy: Array<{ clusterId, lambdaBirth, lambdaDeath, points, size, children }>`
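A minimal sketch of clustering columns directly; the classes are exposed on `Analyze.Clustering`, and the series are illustrative:

```js
import { Analyze } from 'als-statistics';
const { Dbscan, Hdbscan } = Analyze.Clustering;

const series = {
  a: [1, 2, 3, 4, 5],
  b: [1.1, 2.1, 2.9, 4.2, 5.1],  // similar shape to "a"
  c: [9, 7, 2, 0, -3],           // different shape
};

// DBSCAN over columns (distances computed with the 'mad' metric).
const db = new Dbscan(series, { eps: 0.4, minPts: 2 });
console.log(db.labels);    // per-column labels: -1 = noise, 1.. = cluster id
console.log(db.clusters);  // [{ id, columns: [...] }, ...]

// HDBSCAN with the default minClusterSize of 2.
const hdb = new Hdbscan(series, { minClusterSize: 2 });
console.log(hdb.labels, hdb.clusters);
```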
Regression — practical usage
The Regression wrapper builds a sequence of models (steps). Start with a baseline, then call next([...]) to add more predictors. Interaction terms are supported via the 'X*Z' notation.
new Regression(data, { yName: string, xNames?: string[], type?: 'linear'|'logistic' })
reg.next(newPredictors: string[]): this
reg.steps: Array<RegressionBase> // each step is a fitted model
reg.results: Array<Record<string, any>> // array of .result from each step
reg.htmlTables: string // combined HTML of all steps
A) Linear — baseline, then moderator (interaction)
import { Analyze } from 'als-statistics';
const { Regression } = Analyze;
const data = { X:[1,2,3,4,5], Z:[0,1,0,1,0], Y:[2,3,6,7,10] };
// Step 0: Y ~ X
const reg = new Regression(data, { yName:'Y', xNames:['X'], type:'linear' });
// Step 1: add moderator Z and interaction X*Z
reg.next(['Z', 'X*Z']);
const step0 = reg.steps[0].result; // { step, n, Variable[], Coefficient[], StdError[], pValue[] }
const step1 = reg.steps[1].result; // includes the 'X*Z' row
console.log(step1.Variable.includes('X*Z')); // true
B) “Mediator-like” step (add M, compare steps)
There’s no built-in mediation test (Sobel/bootstrapping).
However, you can model a putative mediator by adding it as a predictor on the next step and comparing coefficients/R².
const data = { X:[1,2,3,4,5,6], M:[2,4,5,7,7,9], Y:[3,5,7,9,10,13] };
// Step 0: Y ~ X
const reg = new Regression(data, { yName:'Y', xNames:['X'], type:'linear' });
// Step 1: Y ~ X + M
reg.next(['M']);
console.log(reg.steps[0].r2, reg.steps[1].r2); // change in R²
console.log(reg.steps[1].result.Variable.includes('M')); // true
C) Logistic — classification with accuracy
const data = { X:[0,1,2,3,4], Y:[0,0,0,1,1] };
const logit = new Regression(data, { yName:'Y', xNames:['X'], type:'logistic' });
const s0 = logit.steps[0];
console.log(s0.result.Accuracy); // in [0,1]
console.log(s0.predict(s0.X)); // -> [0/1,...]
console.log(s0.predictProba(s0.X)); // -> probabilities in [0,1]
Notes & tips
- If you omit `xNames`, the wrapper uses all columns except `yName` as predictors.
- `next([...])` creates a clone of the previous step’s columns and (if a name contains `'*'`) generates the interaction term by multiplying the two source predictors element-wise.
- Linear steps expose `StdError[]` and `pValue[]`. Logistic steps expose `Accuracy`.
- The wrapper and cores are deterministic for the same inputs.
Linear Regression (Core)
Class: Regression.LinearRegression
Constructor
new Regression.LinearRegression(table: Record<string, number[]>, yName: string, xNames: string[], stepIndex: number)
Public fields / getters
- `coefficients: number[]` – `[Intercept, β1, …]`.
- `y: number[]`, `X: number[][]`, `yHat: number[]`
- `residuals: number[]`
- `r2: number`
- `standardErrors: number[]`
- `pValues: number[]`
- `n: number`, `k: number` (observations & parameters)
- `result: { step, n, Variable, Coefficient, StdError, pValue }`
- `htmlTable: string`
Methods
- `calculate(): this`
- `predict(X: number[][]): number[]`
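Using the core class directly rather than through the `Regression` wrapper (a sketch; data is illustrative, and the shape passed to `predict` is assumed to be rows of raw predictor values):

```js
import { Analyze } from 'als-statistics';
const { LinearRegression } = Analyze.Regression;

const table = { X: [1, 2, 3, 4, 5], Y: [2.1, 3.9, 6.2, 8.1, 9.8] };

// stepIndex 0 feeds the `step` field of the result/report output.
const lin = new LinearRegression(table, 'Y', ['X'], 0).calculate();

console.log(lin.coefficients);     // [Intercept, β1]
console.log(lin.r2, lin.pValues);  // fit quality and per-coefficient p-values
console.log(lin.predict([[6]]));   // prediction for X = 6 (assumed input shape)
```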
Logistic Regression (Core)
Class: Regression.LogisticRegression
Constructor
new Regression.LogisticRegression(table: Record<string, number[]>, yName: string, xNames: string[], stepIndex: number, learningRate=0.01, epochs=1000)
Public fields / getters
- `coefficients: number[]` – `[Intercept, β1, …]`
- `y: number[]`, `X: number[][]`, `yHat: number[]` (predicted classes)
- `accuracy: number`
- `n: number`, `k: number`
- `result: { step, n, Variable, Coefficient, Accuracy }`
- `htmlTable: string`
Methods
- `calculate(): this`
- `predictProba(X: number[][]): number[]` – probabilities via sigmoid.
- `predict(X: number[][], threshold=0.5): number[]` – hard labels.
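And the logistic core, with the documented gradient-descent defaults (a sketch; data is illustrative):

```js
import { Analyze } from 'als-statistics';
const { LogisticRegression } = Analyze.Regression;

const table = { X: [0, 1, 2, 3, 4, 5], Y: [0, 0, 0, 1, 1, 1] };

// learningRate = 0.01 and epochs = 1000 are the documented defaults.
const logit = new LogisticRegression(table, 'Y', ['X'], 0).calculate();

console.log(logit.accuracy);               // share of correctly classified rows
console.log(logit.predictProba(logit.X));  // probabilities via sigmoid
console.log(logit.predict(logit.X, 0.5));  // hard 0/1 labels at the 0.5 threshold
```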
Analyze — overview & patterns
This section ties together the shortcuts across Table, Statistics, and Analyze.
From Table to analysis
import { Table } from 'als-statistics';
import { Analyze } from 'als-statistics';
const t = new Table(data, { name: 'Survey' });
// Correlation (single pair)
const r1 = t.correlate('height','weight').pearson();
// Correlation matrix (3+ columns)
const rAll = t.correlate('height','weight','age').pearson();
// Compare means (Welch, unequal variances)
const w = t.compareMeans('groupA','groupB').independentWelch();
// One-way ANOVA (classic/Welch)
const a1 = t.compareMeans('A','B','C').anova();
const aW = t.compareMeans('A','B','C').anovaWelch();
// Regression (linear/logistic)
const lin = new Analyze.Regression(t.columns, { yName:'score', xNames:['age','hours'] });
const log = new Analyze.Regression(t.columns, { yName:'passed', xNames:['score'], type:'logistic' });
Split → Combine (Statistics) → Analyze
// Split one table by a factor (returns Statistics with per-group tables)
const S = t.splitBy('group', { 0:'ctrl', 1:'treat' });
// Combine the same column across all split tables into one Table
const merged = S.columns('byGroup', 'score'); // -> ctrl_score, treat_score
// Now analyze as usual
const test = merged.compareMeans('ctrl_score','treat_score').independentWelch();
Keep mutations API-only (`addRow`, `setAt`, `splice`, `values =`). Avoid in-place array edits to preserve caches and consistent results.
Descriptive Statistics
Static utility functions used across the library. These are also mixed into Column (arity‑1 functions as getters; others as methods).
Selected functions
- `sum(values)` · `mean(values)` · `median(values)` · `mode(values)`
- `variance(values)` · `varianceSample(values)` · `stdDev(values)` · `stdDevSample(values)` · `cv(values)`
- `min(values)` · `max(values)` · `range(values)` · `iqr(values)` · `mad(values)`
- `zScore(values, v)` · `zScores(values)` · `zScoresSorted(values)`
- `percentile(values, p)` · `q1(values)` · `q3(values)` · `p10(values)` · `p90(values)`
- `weightedMean(values, weights)`
- `confidenceInterval({ mean, stdDevSample, values })`
- `outliersZScore(values, z=3)` · `outliersIQR(values)`
- `slope({ values })` · `regressionSlope({ X, Y })`
- `spectralPowerDensityArray(values)` · `spectralPowerDensityMetric(values)`
Refer to JSDoc in code for exact parameter objects where applicable.
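A few of them combined (note that some take a plain array while others take a parameter object, as in the signatures above; data is illustrative):

```js
import { Stats } from 'als-statistics';

const values = [4, 8, 15, 16, 23, 42];

console.log(Stats.iqr(values), Stats.mad(values));            // robust spread
console.log(Stats.outliersIQR(values));                       // IQR-based outliers
console.log(Stats.weightedMean(values, [1, 1, 1, 1, 1, 5]));  // weighted mean
console.log(Stats.confidenceInterval({
  mean: Stats.mean(values),
  stdDevSample: Stats.stdDevSample(values),
  values,
}));
```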
Utils
General helper utilities.
Functions
- `htmlTable(rows, headers, options?) → string` – render a simple HTML table (escapes content; supports `firstColHeader`, `fixed` decimals, transposition).
- `round(value, fixed=8) → number | string` – numeric rounding with fixed decimals.
- `range(start=0, end, step=1) → number[]` – numeric range.
- `filterKeys(keys: string[], filters: (string|number|RegExp)[]): string[]` – include names, regex filter, and `'-name'` exclusions.
- `Counter` – simple name counter with `getName(name?)`.
EPS (Golden Test Tolerances)
| Class | Key | Value | Notes |
|----------------|------------|---------|----------------------------------------|
| Descriptives | stat | 1e-6 | Means, medians, quantiles, variance |
| Z-scores | z | 1e-9 | Summary mean/std of z |
| Regression | reg | 1e-5 | Coefficients, metrics |
| CDF | cdf | 1e-9 | CDF/PPF checks |
| Correlations | r | 1e-7 | Pearson/Spearman/Kendall |
| Degrees of Fr. | df | 1e-6 | Welch df (float) |
| ANOVA F | anovaF | 1e-6 | |
| Flatness | flatness | 1e-12 | GM/AM stability |
| SPD Flatness | spd | 1e-12 | GM/AM on SPD |
| p-values | p | 1e-6 | |
Change these in goldens/settings.js if needed.
How‑to
Split a table by predicate and compare groups (Welch):
const { A, B } = Table.split(raw, r => r.group === 'A' ? 'A' : 'B');
const t = A.compareMeans('score', 'B.score').independentWelch();
Detect z-outliers and keep sorted indices:
const { zScores, indexes } = Stats.zScoresSorted({ values });
const top3 = indexes.slice(-3); // largest |z|
Compute spectral flatness of a spectrum:
const spd = Stats.spectralPowerDensityArray({ values: magnitudes });
const flat = Stats.spectralPowerDensityMetric({ spectralPowerDensityArray: spd, values: magnitudes });
Changelog [2.1.0] - 2025-09-04
Breaking change:
`als-statistics` v2 is a ground-up rewrite with no backward compatibility with v1.x.
If you rely on v1: pin your dependency to the latest 1.x release.
npm i als-statistics@^1
Changed
- `Stats.harmonicMean(...)` — inputs `≤ 0` are now clamped to ε (1e-12) before computation (aligns with Python goldens), preventing `NaN`/division-by-zero surprises.
- `Stats.zScores({ values }, sample = false)` — added a second parameter:
  - `sample = false` (default): population std (ddof = 0) — backward-compatible.
  - `sample = true`: sample std (ddof = 1) — matches NumPy/SciPy z-scores and golden summaries.
- `Stats.flatness({ values })` — now returns `0` when the arithmetic mean is `0` (previously `NaN`), making all-zero vectors well-defined.
Fixed
- `Stats.mad(...)` — corrected median absolute deviation for edge cases.
Tests
- Added golden cross-checks against Python (NumPy/SciPy) and HDBSCAN labels; all pass within documented EPS tolerances.
Notes: Default behavior remains the same for `zScores` (ddof = 0) unless `sample = true` is provided. If your code relied on `NaN` from `flatness`/`harmonicMean` for zero/negative inputs, update downstream checks accordingly.
