ALS Statistics
ALS Statistics is a modular JS toolkit for statistical work. It’s designed to be:
- Quality. Numerics verified: this release matches Python (NumPy/SciPy) reference outputs across modules and passes the deterministic Golden Test Suite on Node.js 20.x, all within published EPS tolerances. Reproducible via `node goldens/test.js` and `npm test`.
- Easy to use like `Math` for small one-liners;
- Composable for multi-step analyses (filter → group → compare → summarize);
- Runtime-agnostic — the same API in Node.js and in the browser;
- Data-model light — works with plain arrays (`number[]`) and small helpers like `Column` and `Table`;
- Browser-ready. No native dependencies; works in the browser (as ESM or via the included UMD bundle).
Think of it as a “batteries-included” stats toolbox rather than a full data-frame ecosystem. If you know SPSS: ALS gives you many of the common procedures (correlations, t-tests, ANOVA, reliability, basic clustering, regression) with code-first ergonomics. If you know NumPy/SciPy: ALS focuses on analytics primitives and convenience wrappers (no heavy data containers, no plotting).
Why the rewrite?
The v1 architecture had grown too complex (intertwined modules, heavy abstractions), which made adding features and maintaining consistency difficult.
v2 was rebuilt from scratch with a simpler core (plain arrays + lightweight Column/Table), clear module boundaries, and predictable numerics—so new analytical tools can be added quickly without increasing complexity.
Key ideas
- Plain data in / plain results out. Most functions take `{ [name]: number[] }` or `number[]` and return simple objects (e.g. `{ r, t, df, p }`).
- Two modes of use:
  - One-liners via descriptive helpers (mean, stdDev, percentiles…).
  - Structured analyzers for correlations, mean comparisons, regressions, clustering, etc.
- Table utilities. Sort, filter, split by group, compute derived columns, and feed the result to an analyzer.
Installation
npm i als-statistics
Usage in browser
<script type="module" src="/node_modules/als-statistics/lib/index.js"></script>
or
<script src="/node_modules/als-statistics/statistics.js"></script>
or
<script type="module">
import Statistics from '/node_modules/als-statistics/lib/index.js'
</script>
Node.js
import { Analyze, Stats, Table, Column } from 'als-statistics';
// or
const { Analyze, Stats, Table, Column } = require('als-statistics/statistics.cjs')
const { CDF, CompareMeans, Correlate, Clustering, Regression } = Analyze;
const { constants, t, f, phi } = CDF;
const { IndependentTTest, OneWayAnova, PairedTTest, OneSampleTTest } = CompareMeans;
const { CronbachAlpha, Pearson, Spearman, Kendall } = Correlate;
const { Dbscan, Hdbscan, computeDistances } = Clustering;
const { LinearRegression, LogisticRegression } = Regression;
// Descriptive stats (one-liners)
const {
sum, mean, median, mode, min, max, // central tendency
variance, varianceSample, stdDev, stdDevSample, cv, range, iqr, mad, // dispersion & scale
percentile, q1, q3, p10, p90, // position & percentiles
zScore, zScores, zScoresSorted, outliersZScore, outliersIQR, // z-scores & outliers
weightedMean, confidenceInterval, slope, regressionSlope, // misc
spectralPowerDensityArray, spectralPowerDensityMetric,
sorted, ma, sumOfSquares, flatness, skewness, kurtosis, // other statistics
skewnessSample, kurtosisSample, geometricMean, harmonicMean,
noiseStability, frequencies, relativeFrequencies,
relativeDispersion, normalizedValues, xValues,
recode, // recode values
} = Stats;
The package is modular — import only what you use.
Quick starts
1) Use it like Math (one-liners)
import { Stats } from 'als-statistics';
const X = [10, 12, 13, 9, 14];
const mu = Stats.mean(X);
const sd = Stats.stdDevSample(X);
const p90 = Stats.p90(X);
console.log({ mu, sd, p90 });
// → { mu: 11.6, sd: 2.073..., p90: 13.8 }
You can also access many metrics via Column:
import { Column } from 'als-statistics';
const col = new Column([10, 12, 13, 9, 14], 'Score');
const { mean, stdDev, median, frequencies, flatness } = col;
2) Quick analysis: correlation in one line
import { Analyze } from 'als-statistics';
const data = {
gender: [0, 1, 0, 1, 1, 0], // 0=female, 1=male
score: [62, 75, 70, 81, 64, 78],
};
const pearson = new Analyze.Correlate(data).pearson('gender', 'score');
const { r, t, df, p } = pearson;
console.log({ r, t, df, p });
// r in [-1, 1], two-sided p-value in [0, 1]
3) Compare means: Welch t-test (unequal variances)
import { Analyze } from 'als-statistics';
const data = {
men: [62, 75, 70, 81, 64],
women: [78, 73, 69, 71, 74, 77],
};
const test = new Analyze.CompareMeans(data).independentWelch('men', 'women');
console.log({ t: test.t, df: test.df, p: test.p });
4) One-way ANOVA (classic & Welch)
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
const data = {
A: [10, 11, 9, 10],
B: [10, 30, -10, 50, -20],
C: [12, 13, 12, 11, 14],
};
const classic = new CompareMeans(data).anova(); // pooled (equal variances)
const welch = new CompareMeans(data).anovaWelch(); // unequal variances
console.log({
classic: { F: classic.F, df1: classic.dfBetween, df2: classic.dfWithin, p: classic.p },
welch: { F: welch.F, df1: welch.dfBetween, df2: welch.dfWithin, p: welch.p },
});
5) Table-first workflow (filter → split → analyze)
import { Table } from 'als-statistics';
const t = new Table(
{ gender: [0,1,0,1,1,0], age: [21,22,20,23,19,22], score: [62,75,70,81,64,78] },
{ name: 'Survey' }
);
// Keep adults 21+
t.filterRowsBy('age', a => a >= 21);
// Compare score by gender with Welch
// Option A: already split into columns:
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
const cm = new CompareMeans({ men: [...], women: [...] }).independentWelch('men', 'women');
// Option B: split first, then pass to CompareMeans:
const groups = t.splitBy('gender'); // returns { groupName: number[] }
const test = new CompareMeans(groups).independentWelch('0', '1');
Data managing (Tables and Columns)
This section explains how data flows through Columns, Tables and Statistics: validation rules, caching, safe updates, and the most common operations you’ll use before running analytics.
Notes & pitfalls
- Always mutate via API. Use `Column` mutators or the `values` setter; avoid direct array mutation to keep caches correct.
- Invalids. `Column.invalid` stores indices of rejected values; descriptives and analyses ignore them.
- Mutability. Most `Table` methods are in-place and return `this`. Prefer `clone()` when you need a safe branch.
- Alignment. If you disable alignment and keep ragged columns, be mindful when exporting rows or running analyses that expect equal lengths.
- HTML output. `htmlTable()` is for quick previews; for full reports, prefer exporting rows and rendering via your own templates.
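A short sketch of the first three rules above (API-only mutation, invalid tracking, and cloning before a destructive change); data is illustrative:

```js
import { Table, Column } from 'als-statistics';

// Invalid inputs are tracked, not silently dropped.
const col = new Column([5, 7, NaN, 9], 'X');
console.log(col.invalid);           // indices of rejected values
col.push(11);                       // mutate via the API so caches stay correct

// Branch before a destructive filter; the original table stays intact.
const t = new Table({ age: [19, 23, 31, 42], score: [60, 72, 68, 80] });
const adults = t.clone('adults-only');
adults.filterRowsBy('age', a => a >= 21);
console.log(t.n, adults.n);         // 4 rows vs 3 rows
```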
Column
Quick API snapshot
import Statistics ,{ Table, Column } from 'als-statistics';
// Column
// static
Column.key(name, ...parts)
// properties/getters
col.name
col.labels? // optional labels aligned with values
col.invalid // indices of invalid inputs
col.values // get/set (validated)
col.n // length
// cache/events
col.$(key, compute) // memoize custom computations
col.onchange(fn) // subscribe to structural changes
// mutation helpers (invalidate caches automatically)
col.addValue(value, index?)
col.deleteValue(index)
col.clone(name?)
col.insertAt(index, ...items)
col.setAt(index, item)
col.removeAt(index, deleteCount=1)
col.splice(start, deleteCount, ...items)
col.push(...items)
// descriptive on Column (same names as Stats one-liners)
col.sum, col.mean, col.median, col.mode
col.variance, col.varianceSample, col.stdDev, col.stdDevSample, col.cv, col.range, col.iqr, col.mad
col.percentile(p), col.q1, col.q3, col.p10, col.p90
col.zScore(v), col.zScores(), col.zScoresSorted(), col.outliersZScore(z=3), col.outliersIQR()
col.weightedMean(weights), col.confidenceInterval, col.slope, col.regressionSlope(customX)
col.spectralPowerDensityArray, col.spectralPowerDensityMetric
How it works (principles)
- Validation-first. Columns accept only finite numbers. Any non-finite input (`NaN`, `±Infinity`, non-number) is rejected or tracked via `col.invalid`, and excluded from descriptive metrics.
- Cached results. Many results are cached (e.g., `col.mean`, `col.stdDev`). To keep caches correct, you must not mutate the underlying array directly. Instead, either:
  - assign a new array via the validated setter: `col.values = [...newNumbers]`, or
  - use the provided mutators (`setAt`, `splice`, `push`, …).
  These paths automatically invalidate caches and fire `onchange` events.
- Alignment in tables. By default, a `Table` aligns columns to a common length (truncates to the shortest column). You can change this behavior with constructor options (e.g., `alignColumns: false`, `minK`) or call `t.alignColumns()` explicitly.
- In-place transforms. Most `Table` methods mutate. Chain them freely, or use `clone()` to keep the original around.
Creating and validating
import { Column } from 'als-statistics';
const scores = new Column([10, 12, 13, 9, 14], 'Score');
// set a new validated series (replaces data, clears caches)
scores.values = [11, 11, 10, 12, 15];
// invalid values are tracked and excluded from stats
scores.values = [11, 12, NaN, 10, 9, Infinity];
console.log(scores.invalid); // [2, 5]
console.log(scores.mean); // mean over valid entries only
Do not mutate `scores.values` in place (e.g., `scores.values[0] = 999`), as caches won’t know about it. Use `setAt(...)` instead.
Safe mutations (cache-aware)
// append values
scores.push(10, 11);
// insert at position
scores.insertAt(1, 99);
// replace a single value
scores.setAt(0, 12);
// delete & splice
scores.deleteValue(2);
scores.splice(3, 1, 50, 51);
All of these invalidate caches and emit onchange:
scores.onchange((col, prev, meta) => {
console.log('column changed:', meta.type)
});
Caching your own computations
// memoize expensive custom metric
const kurt = scores.$('kurtosis', () => {
// compute once, then served from cache until data changes
return scores.kurtosis; // or any custom formula
});
Descriptives on Column
Every descriptive method available in Stats exists on Column too and always respects validation/caching:
console.log({
mean: scores.mean,
sd : scores.stdDevSample,
q1 : scores.q1,
p90 : scores.p90,
outliersZ: scores.outliersZScore(3)
});
Table
Quick API snapshot
import { Table } from 'als-statistics';
const t = new Table(data?, { name?, minK?, alignColumns? })
// properties/getters
t.n // rows count
t.k // columns count
t.columns // map of Column
t.colNames // string[]
t.colValues // Record<string, number[]>
t.json // plain object view
// row/column transforms (in-place; use clone() to branch)
t.addColumn(name, values, labels?) -> Column
t.deleteColumn(name) -> this
t.addRow(row, index?) -> this
t.addRows(rows, index?) -> this
t.deleteRow(index) -> this
t.alignColumns() -> this
// data shaping
t.recode(colName, mapper, newColName?) -> void
t.compute(fn, name) -> Column
t.filterRows(indexes) -> this
t.filterRowsBy(colName, predicate) -> this
t.sortBy(colName, asc=true) -> this
t.clone(name?, colFilter=[]) -> Table
t.splitBy(colName, labels?) -> Statistics
t.transpose(colNames=[]) -> Table
t.where(rowPredicate) -> number[]
t.rows(withKeys=true) -> object[] | any[][]
t.htmlTable(colFilter=[], options?) -> string
t.descriptive(...metricNames) -> Object{} // Descriptive statistics for all columns
// analysis shortcuts
t.correlate(...colFilter) -> Correlate
t.compareMeans(...colFilter) -> CompareMeans
t.dbscan(colFilter, options?) -> Dbscan
t.hdbscan(colFilter, options?) -> Hdbscan
t.regression(yName, xNames, type='linear'|'logistic') -> Regression
t.linear(yName, xNames)
t.logistic(yName, xNames)
Tip: operations on `Table` are mutable by default (they change the same instance). Use `t.clone(...)` to branch a copy for “what-if” scenarios.
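The `descriptive(...metricNames)` shortcut gives a quick per-column summary; a minimal sketch (the exact shape of the returned object is an assumption; check the JSDoc):

```js
import { Table } from 'als-statistics';

const t = new Table({ age: [21, 22, 20, 23], score: [62, 75, 70, 81] });

// Metric names match the Stats one-liners / Column getters.
const summary = t.descriptive('mean', 'stdDevSample', 'median');
console.log(summary); // assumed shape: { age: { mean, stdDevSample, median }, score: { ... } }
```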
Constructing and alignment
import { Table } from 'als-statistics';
const t = new Table(
{ gender: [0,1,0,1,1,0], age: [21,22,20,23,19], score: [62,75,70,81,64,78] },
{ name: 'Survey', alignColumns: true, minK: 2 }
);
// When alignColumns=true (default), columns are trimmed to the shortest length.
// You can turn this off via { alignColumns: false } if you need ragged columns.
console.log(t.n, t.k, t.colNames); // rows, columns, names
// Access Column objects
const scoreCol = t.columns['score'];
console.log(scoreCol.mean);
Rows & columns (synchronization)
// add/delete columns
t.addColumn('bmi', [22.1, 24.0, 23.7, 25.3, 21.8]);
t.deleteColumn('age');
// add rows (object keys match column names)
t.addRow({ gender: 0, score: 71, bmi: 23.1 });
t.addRows([
{ gender: 1, score: 68, bmi: 24.2 },
{ gender: 0, score: 77, bmi: 22.7 }
]);
// delete rows
t.deleteRow(0);
// re-align explicitly if needed
t.alignColumns();
Data shaping
// recode values (e.g., 0/1 -> 'F'/'M'), optionally write to a new column
t.recode('gender', g => (g === 0 ? 'F' : 'M'), 'genderLabel');
// compute a derived numeric column
t.compute(row => row.score / (row.bmi ?? 1), 'scorePerBmi');
// filter & sort (in place)
t.filterRowsBy('score', s => s >= 70);
t.sortBy('score', /*asc=*/false);
// pick rows by predicate (returns indices)
const adultIdx = t.where(row => row.bmi >= 22 && row.bmi <= 25);
// grab data in different shapes
const rowsAsObjects = t.rows(true);
const rowsAsArrays = t.rows(false);
const html = t.htmlTable(['genderLabel','score','bmi']);
Split & analyze
// split one column into groups, then run an analysis
const groups = t.splitBy('genderLabel'); // => { F: number[], M: number[] }
import { Analyze } from 'als-statistics';
const test = new Analyze.CompareMeans(groups).independentWelch('F', 'M');
console.log({ t: test.t, df: test.df, p: test.p });
// or use shortcuts directly from Table
const corr = t.correlate('score','bmi').pearson();
console.log({ r: corr.r, p: corr.p });
Transpose and clone
// transpose a subset of columns (handy for certain distance/clustering operations)
const t2 = t.transpose(['score','bmi']);
// clone to branch a scenario without touching the original
const tClone = t.clone('scenario: filtered', ['score','bmi']);
Statistics (multi-table manager)
Statistics is a lightweight coordinator for multiple Tables. It lets you:
- register tables (`addTable`),
- compute the union of available column names (`colNames`),
- combine the same columns from different tables into a new `Table` (`columns(...)`),
- remove tables (`deleteTable`),
- and access the module namespaces (static): `Statistics.Table`, `Statistics.Stats`, `Statistics.Analyze`, `Statistics.Column`.
It’s especially handy for before/after designs, or when you split one table by a factor and then want to analyze the resulting groups together.
API
new Statistics(name?: string)
statistics.addTable(obj: Record<string, number[]>, options?: { name?: string, minK?: number, alignColumns?: boolean }): Table
statistics.deleteTable(tableName: string): void
// set of distinct column names across all registered tables
statistics.colNames: string[]
// Combine selected columns (from *every* table that has them) into a new Table.
// Result columns are named `${tableName}_${colName}`.
statistics.columns(name: string, ...colFilter: (string|RegExp)[]): Table
// Static accessors (namespaces)
Statistics.Table
Statistics.Stats
Statistics.Analyze
Statistics.Column
Column selection (colFilter)
columns(name, ...colFilter) uses the same filtering helper as Table:
- pass exact names: `columns('X', 'score')`
- pass regex: `columns('X', /^score|age$/)`
- exclude by prefixing with `-`: `columns('X', 'score', '-score_z')`
Examples
1) Before/After (paired)
import Statistics from 'als-statistics';
const { CompareMeans } = Statistics.Analyze;
const S = new Statistics('A/B');
// register two tables with the same column name "score"
S.addTable({ score: [62, 71, 69, 73, 75] }, { name: 'before' });
S.addTable({ score: [70, 76, 70, 78, 79] }, { name: 'after' });
// collect score columns from all tables into one Table
const merged = S.columns('Scores', 'score'); // -> columns: before_score, after_score
// run paired t-test using the Table shortcut
const paired = merged.compareMeans('before_score', 'after_score').paired();
console.log({ t: paired.t, df: paired.df, p: paired.p });
2) Split → Combine → Independent Welch
import { Table } from 'als-statistics';
import Statistics from 'als-statistics';
const { CompareMeans } = Statistics.Analyze;
const t = new Table(
{ group: [0,1,0,1,0,1], score: [62,75,70,81,64,78] },
{ name: 'Survey' }
);
// split by "group" → returns a Statistics instance with one table per group
const S = t.splitBy('group', { 0: 'control', 1: 'treat' });
// bring the "score" columns from each split table into ONE Table
const merged = S.columns('scored', 'score'); // control_score, treat_score
const test = merged.compareMeans('control_score','treat_score').independentWelch();
console.log({ t: test.t, df: test.df, p: test.p });
3) Cross-table correlation
const merged = S.columns('ab', 'score'); // e.g., before_score, after_score
const corr = merged.correlate('before_score','after_score').pearson();
console.log({ r: corr.r, p: corr.p });
Scenarios
A) Before/After (pre→post) in separate tables
import Statistics from 'als-statistics';
const S = new Statistics();
S.addTable(preTable, { name: 'pre' });
S.addTable(postTable, { name: 'post' });
// Merge the same column name from multiple tables
const merged = S.columns('merged', 'score'); // pre_score, post_score
const cm = merged.compareMeans('pre_score','post_score').paired();
console.log({ t: cm.t, df: cm.df, p: cm.p });
B) Split → Combine workflow
const S = new Statistics();
S.addTable(rawTable, { name: 'raw' });
// Split by factor into two new tables
const { control, treat } = S.split('raw', by => by.group === 'A' ? 'control' : 'treat');
// Combine same-named columns for cross-table analysis
const merged = S.columns('combined', 'score'); // control_score, treat_score
const res = merged.compareMeans('control_score','treat_score').independentWelch();
How‑to recipes
- Compute cross-table correlation between `before_score` and `after_score`:
  const merged = S.columns('ab', 'score');
  merged.correlate('before_score','after_score').pearson();
- Build a summary sheet for multiple tables (mean, sd, n):
  const names = S.colNames;
  const rows = names.map(col => {
    const t = S.columns('tmp', col);
    const d = t.describe(`${col}_0`); // first
    return { col, mean: d.mean, sd: d.stdDevSample, n: d.n };
  });
Practical patterns
A. Pipeline “sort → split → test”
import { Table } from 'als-statistics';
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;
// sort by score, keep top 100 rows, split by gender, compare means
const t = new Table(data).sortBy('score', false);
const top = t.clone('Top').filterRows([...Array(100).keys()]); // keep first 100 indices
const groups = top.splitBy('gender'); // returns small structure per group
const cm = new CompareMeans(groups);
const res = cm.independentWelch('0','1');
console.log(res.p < 0.05 ? 'Significant' : 'NS');
B. Correlations with filters
import { Table } from 'als-statistics';
const t = new Table(data);
t.filterRowsBy('age', a => a >= 25 && a <= 40);
const corr = t.correlate('height','weight').pearson();
console.log(corr.r, corr.p);
C. Quick reliability check
import { Analyze } from 'als-statistics';
const items = { Q1: [...], Q2: [...], Q3: [...], Q4: [...] };
const alpha = new Analyze.Correlate.CronbachAlpha(items);
console.log(alpha.alpha, alpha.htmlTable);
D. Minimal regression report
import { Analyze } from 'als-statistics';
const reg = new Analyze.Regression(dataset, { yName: 'y', xNames: ['x1','x2'], type: 'linear' });
// step 1
reg.steps[0].calculate();
console.log(reg.steps[0].result); // table-like object for reporting
Analyze · CDF
Cumulative distribution functions used by other tests.
Exports
- `CDF.regularizedIncompleteBeta(x, a, b): number` – Regularized incomplete beta Iₓ(a,b). Clamps to `[0,1]` when `x ≤ 0` or `x ≥ 1`.
- `CDF.t(x, df): number` – CDF of the Student t distribution. `df` must be positive.
- `CDF.f(x, df1, df2): number` – CDF of the F distribution. `df1, df2` must be positive.
- `CDF.phi(x): number` – Standard normal CDF Φ(x). Returns 0/1 for large negative/positive tails and supports `±Infinity`.
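These are the same CDFs the test classes use to turn a statistic into a p-value. A hedged sketch of doing that conversion by hand (the two-sided/right-tail formulas below are standard statistical conventions, not documented helpers):

```js
import { Analyze } from 'als-statistics';
const { CDF } = Analyze;

// Two-sided p-value from a t statistic and its degrees of freedom.
const t = 2.31, df = 9;
const pT = 2 * (1 - CDF.t(Math.abs(t), df));

// Right-tail p-value for an F statistic (as used by ANOVA).
const pF = 1 - CDF.f(4.2, 2, 12);

// Two-sided p-value from a z statistic via the standard normal CDF.
const pZ = 2 * (1 - CDF.phi(Math.abs(1.96))); // ≈ 0.05
console.log({ pT, pF, pZ });
```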
Analyze · Compare Means
Entry-point wrapper CompareMeans for mean-comparison tests (t-tests, ANOVA).
Class: CompareMeans
Constructor
new CompareMeans(data: Record<string, number[]>)
- `data` – object mapping group name → numeric array.
Methods
- `paired(...colNames): PairedTTest` – paired t-test on two named columns; trims to equal length.
- `independent(...colNames): IndependentTTest` – two-sample Student t-test (pooled variance).
- `independentWelch(...colNames): IndependentTTest` – two-sample Welch t-test.
- `anova(...colNames): OneWayAnova` – one-way ANOVA (pooled/“classic”).
- `anovaWelch(...colNames): OneWayAnova` – Welch’s one-way ANOVA.
- `oneSample(colName?, mu0=0): OneSampleTTest` – one-sample t-test for a single column (defaults to the first key if `colName` is omitted).
All methods accept optional column names. If omitted, the test uses all keys from the constructor data.
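For example (data illustrative):

```js
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;

const data = { A: [10, 11, 9, 10], B: [12, 13, 12, 11] };

// Explicit column names:
const welch = new CompareMeans(data).independentWelch('A', 'B');

// Omitted names: all keys from the constructor data are used.
const anova = new CompareMeans(data).anova();

// One-sample test of column A against a hypothesized mean of 10.
const one = new CompareMeans(data).oneSample('A', 10);
console.log(welch.p, anova.p, one.p);
```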
One‑Way ANOVA
Classic (pooled) and Welch’s ANOVA.
Class: OneWayAnova (returned by CompareMeans.anova / anovaWelch)
Constructor
new OneWayAnova(data: Record<string, number[]>, welch=false)
- `data` – group → values.
- Set `welch=true` for Welch ANOVA.
Public fields
- `F: number`
- `dfBetween: number`
- `dfWithin: number`
- `p: number` – right-tail p-value via F CDF.
- `k: number` – number of groups.
- `msw: number` – mean square within.
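For example, constructing the class directly (equivalent to the `CompareMeans.anova` / `anovaWelch` shortcuts; data is illustrative):

```js
import { Analyze } from 'als-statistics';
const { OneWayAnova } = Analyze.CompareMeans;

const data = {
  A: [10, 11, 9, 10],
  B: [14, 13, 15, 16, 14],
  C: [12, 13, 12, 11, 14],
};

const classic = new OneWayAnova(data);        // pooled ("classic") ANOVA
const welch   = new OneWayAnova(data, true);  // Welch's ANOVA
console.log(classic.F, classic.dfBetween, classic.dfWithin, classic.p);
console.log(welch.p, welch.k, welch.msw);
```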
Independent Samples T‑Test
Two-sample t‑test. Supports Student (pooled) and Welch variants.
Class: IndependentTTest (returned by CompareMeans.independent / independentWelch)
Constructor
new IndependentTTest({ g1: number[], g2: number[] }, welch=false)
- Set `welch=true` for Welch’s unequal-variance t-test.
Public fields
- `t: number` – t statistic.
- `df: number` – degrees of freedom (Welch uses Satterthwaite).
- `p: number` – two-sided p-value (getter).
- `F: number` – ANOVA-equivalent `t²` (getter).
- `leveneF: number` – Levene’s F for equality of variances.
- `leveneDf1: number`, `leveneDf2: number`, `leveneP: number` – Levene’s test details.
- `k: number` – number of groups (always 2 here).
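A sketch of using the Levene fields to choose between the pooled and Welch variants (the 0.05 cut-off is a common convention, not part of the API; data is illustrative):

```js
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;

const groups = { men: [62, 75, 70, 81, 64], women: [78, 73, 69, 71, 74, 77] };

// Run the pooled test first; Levene's statistics are exposed on the result.
const pooled = new CompareMeans(groups).independent('men', 'women');
const test = pooled.leveneP < 0.05
  ? new CompareMeans(groups).independentWelch('men', 'women') // variances differ
  : pooled;                                                   // pooled is fine
console.log({ t: test.t, df: test.df, p: test.p });
```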
One-Sample T‑Test
Class: OneSampleTTest (returned by CompareMeans.oneSample)
Constructor
new OneSampleTTest({ X: number[] }, mu0=0)
- Requires n ≥ 2.
Public fields
- `t: number` – t statistic.
- `df: number` – `n - 1`.
- `p: number` – two-sided p-value.
- `sd: number` – sample standard deviation.
- `se: number` – standard error `sd / sqrt(n)`.
- `mu0: number` – hypothesized mean.
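For example (data illustrative):

```js
import { Analyze } from 'als-statistics';
const { CompareMeans } = Analyze;

// Is the mean reaction time different from a hypothesized 300 ms?
const rt = { reaction: [288, 310, 305, 296, 315, 302] };
const test = new CompareMeans(rt).oneSample('reaction', 300);
console.log({ t: test.t, df: test.df, p: test.p, se: test.se });
```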
Paired T‑Test
Paired (dependent) samples t‑test.
Class: PairedTTest (returned by CompareMeans.paired)
Constructor
new PairedTTest({ A: number[], B: number[] })
- Requires at least two arrays; internally trims to the same length.
Public fields
- `t: number` – t statistic.
- `df: number` – degrees of freedom (`n - 1`).
- `p: number` – two-sided p-value (Student t).
- `meanDelta: number` – mean of pairwise differences.
- `sdDelta: number` – sample SD of differences.
- `n: number` – number of paired observations.
- `diffs: number[]` – raw pairwise deltas (`A[i] − B[i]`).
Correlate — practical usage
Two columns vs matrix
import { Analyze } from 'als-statistics';
const { Correlate } = Analyze;
// 1) EXACTLY TWO columns → returns a single test instance
const one = new Correlate({ X: [1,2,3], Y: [2,4,9] }).pearson('X', 'Y');
console.log(one.r, one.t, one.df, one.p);
// 2) THREE OR MORE columns → returns a map of pairwise results
const all = new Correlate({ A:[...], B:[...], C:[...] }).pearson();
console.log(Object.keys(all)); // ['A|B','A|C','B|C']
console.log(all['A|B'].r, all['A|B'].p);
Population vs sample (Pearson)
- `pearson()` — uses population covariance in the r-formula.
- `pearsonSample()` — uses sample covariance.
- Both provide a two-sided `p` via the t-distribution with `df = n - 2`.
const p1 = new Correlate(data).pearson(); // population r
const p2 = new Correlate(data).pearsonSample(); // sample r
Spearman & Kendall (ties handled)
const s = new Correlate({ X:[...], Y:[...] }).spearman('X','Y');
const k = new Correlate({ X:[...], Y:[...] }).kendall('X','Y');
console.log(s.r, s.p, k.tau, k.p);
Two-sided helpers: `.spearmanTwoSided()` and `.kendallTwoSided()`.
Reliability — Cronbach’s alpha
// Option A: import the class directly
import { CronbachAlpha } from 'als-statistics/analyze/correlate/cronbach-alpha.js';
// Option B: via the namespace
import { Analyze } from 'als-statistics';
const { Correlate } = Analyze;
// new Correlate.CronbachAlpha(table) // same class
const items = { Q1:[...], Q2:[...], Q3:[...] };
const alpha = new CronbachAlpha(items);
console.log(alpha.alpha); // overall alpha
console.log(alpha.ifItemsDeleted); // { Q1: α_if_deleted, ... }
console.log(alpha.htmlTable); // ready-to-embed HTML with a small table
Notes:
- `Correlate` methods auto-trim vectors to the shortest length where needed (e.g., Spearman).
- Pairwise matrices return a plain object of test instances keyed as `'A|B'`.
Cronbach’s Alpha
Class: CronbachAlpha
Constructor
new CronbachAlpha(data: Record<string, number[]>)
- Requires ≥ 2 parallel scales/items of equal length.
Public fields
- `alpha: number` – reliability estimate.
- `sumOfVariances: number` – sum of item variances (sample).
- `sumColumnVariance: number` – variance of summed score across rows.
- `bessel: number` – correction factor `k/(k-1)`.
- `ifItemsDeleted: Record<string, number>` – getter recomputed lazily.
- `htmlTable: string` – formatted summary table (getter).
Kendall Rank Correlation (τ)
Class: Kendall
Constructor
new Kendall({ X: number[], Y: number[] }, twoSided=true)
Public fields
- `tau: number`
- `z: number` – normal approximation for significance.
- `p: number` – p-value (two-sided by default).
- `t: number` – alias of `z` (for consistency with other tests).
- `df: number` – `Infinity` (normal approximation).
Pearson Correlation
Class: Pearson
Constructor
new Pearson({ X: number[], Y: number[] }, population=false)
- When `population=true`, covariance uses the population denominator.
Public fields
- `covariance: number`
- `df: number` – `n - 2`
- `r: number` – correlation coefficient in `[-1, 1]`
- `t: number` – test statistic
- `p: number` – two-sided p-value
Spearman Rank Correlation
Class: Spearman
Constructor
new Spearman({ X: number[], Y: number[] })
Public fields
- `r: number` – Spearman’s rho
- `t: number` – t-approximation of significance
- `p: number` – two-sided p-value
- `n: number` – number of paired observations (shorter input is trimmed)
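These classes can also be constructed directly; they are exposed on the `Analyze.Correlate` namespace (data illustrative):

```js
import { Analyze } from 'als-statistics';
const { Pearson, Spearman, Kendall } = Analyze.Correlate;

const X = [1, 2, 3, 4, 5];
const Y = [2, 4, 5, 4, 6];

const pearson  = new Pearson({ X, Y });   // r, t, df, p
const spearman = new Spearman({ X, Y });  // r, t, p, n
const kendall  = new Kendall({ X, Y });   // tau, z, p (two-sided by default)
console.log(pearson.r, spearman.r, kendall.tau);
```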
Analyze · Clustering
Density-based clustering over columns using precomputed distances between series.
Class: Dbscan
Constructor
new Dbscan(data: Record<string, number[]>, options?: { eps?: number, minPts?: number, metric?: 'mad' })
- `eps` (default `0.4`), `minPts` (default `3`), `metric` (default `'mad'`).
Public fields
- `metric: string` · `eps: number` · `minPts: number`
- `labels: number[]` – `0` unvisited, `-1` noise, `1..` cluster id per column.
- `clusters: Array<{ id:number, columns:string[] }>` – built by `buildClusters`.
- `distances: number[][]` – symmetric distance matrix.
- Core methods (invoked by constructor): `findNeighbors(i)`, `expandCluster(i, clusterId)`, `run()`.
Class: Hdbscan
Constructor
new Hdbscan(data: Record<string, number[]>, options?: { metric?: 'mad', minClusterSize?: number })
- `minClusterSize` defaults to `2`.
Public fields
- `metric: string`, `minClusterSize: number`
- `labels: number[]` – final labels per column.
- `clusters: Array<{ id:number, columns:string[] }>`
- `mreachDistances: number[][]` – mutual reachability distances.
- `mst: Array<[i,j,weight]>` – minimum spanning tree.
- `hierarchy: Array<{ clusterId, lambdaBirth, lambdaDeath, points, size, children }>`
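A minimal sketch of clustering columns directly; the classes are exposed on `Analyze.Clustering`, and the series are illustrative:

```js
import { Analyze } from 'als-statistics';
const { Dbscan, Hdbscan } = Analyze.Clustering;

const series = {
  a: [1, 2, 3, 4, 5],
  b: [1.1, 2.1, 2.9, 4.2, 5.1],  // similar shape to "a"
  c: [9, 7, 2, 0, -3],           // different shape
};

// DBSCAN over columns (distances computed with the 'mad' metric).
const db = new Dbscan(series, { eps: 0.4, minPts: 2 });
console.log(db.labels);    // per-column labels: -1 = noise, 1.. = cluster id
console.log(db.clusters);  // [{ id, columns: [...] }, ...]

// HDBSCAN with the default minClusterSize of 2.
const hdb = new Hdbscan(series, { minClusterSize: 2 });
console.log(hdb.labels, hdb.clusters);
```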
Regression — practical usage
The Regression wrapper builds a sequence of models (steps). Start with a baseline, then call next([...]) to add more predictors. Interaction terms are supported via the 'X*Z' notation.
new Regression(data, { yName: string, xNames?: string[], type?: 'linear'|'logistic' })
reg.next(newPredictors: string[]): this
reg.steps: Array<RegressionBase> // each step is a fitted model
reg.results: Array<Record<string, any>> // array of .result from each step
reg.htmlTables: string // combined HTML of all steps
A) Linear — baseline, then moderator (interaction)
import { Analyze } from 'als-statistics';
const { Regression } = Analyze;
const data = { X:[1,2,3,4,5], Z:[0,1,0,1,0], Y:[2,3,6,7,10] };
// Step 0: Y ~ X
const reg = new Regression(data, { yName:'Y', xNames:['X'], type:'linear' });
// Step 1: add moderator Z and interaction X*Z
reg.next(['Z', 'X*Z']);
const step0 = reg.steps[0].result; // { step, n, Variable[], Coefficient[], StdError[], pValue[] }
const step1 = reg.steps[1].result; // includes the 'X*Z' row
console.log(step1.Variable.includes('X*Z')); // true
B) “Mediator-like” step (add M, compare steps)
There’s no built-in mediation test (Sobel/bootstrapping).
However, you can model a putative mediator by adding it as a predictor on the next step and comparing coefficients/R².
const data = { X:[1,2,3,4,5,6], M:[2,4,5,7,7,9], Y:[3,5,7,9,10,13] };
// Step 0: Y ~ X
const reg = new Regression(data, { yName:'Y', xNames:['X'], type:'linear' });
// Step 1: Y ~ X + M
reg.next(['M']);
console.log(reg.steps[0].r2, reg.steps[1].r2); // change in R²
console.log(reg.steps[1].result.Variable.includes('M')); // true
C) Logistic — classification with accuracy
const data = { X:[0,1,2,3,4], Y:[0,0,0,1,1] };
const logit = new Regression(data, { yName:'Y', xNames:['X'], type:'logistic' });
const s0 = logit.steps[0];
console.log(s0.result.Accuracy); // in [0,1]
console.log(s0.predict(s0.X)); // -> [0/1,...]
console.log(s0.predictProba(s0.X)); // -> probabilities in [0,1]
Notes & tips
- If you omit `xNames`, the wrapper uses all columns except `yName` as predictors.
- `next([...])` creates a clone of the previous step’s columns and (if a name contains `'*'`) generates the interaction term by multiplying the two source predictors element-wise.
- Linear steps expose `StdError[]` and `pValue[]`. Logistic steps expose `Accuracy`.
- The wrapper and cores are deterministic for the same inputs.
Linear Regression (Core)
Class: Regression.LinearRegression
Constructor
new Regression.LinearRegression(table: Record<string, number[]>, yName: string, xNames: string[], stepIndex: number)
Public fields / getters
- `coefficients: number[]` – `[Intercept, β1, …]`.
- `y: number[]`, `X: number[][]`, `yHat: number[]`
- `residuals: number[]`
- `r2: number`
- `standardErrors: number[]`
- `pValues: number[]`
- `n: number`, `k: number` (observations & parameters)
- `result: { step, n, Variable, Coefficient, StdError, pValue }`
- `htmlTable: string`
Methods
- `calculate(): this`
- `predict(X: number[][]): number[]`
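Using the core class directly rather than through the `Regression` wrapper (a sketch; data is illustrative, and the shape passed to `predict` is assumed to be rows of raw predictor values):

```js
import { Analyze } from 'als-statistics';
const { LinearRegression } = Analyze.Regression;

const table = { X: [1, 2, 3, 4, 5], Y: [2.1, 3.9, 6.2, 8.1, 9.8] };

// stepIndex 0 feeds the `step` field of the result/report output.
const lin = new LinearRegression(table, 'Y', ['X'], 0).calculate();

console.log(lin.coefficients);     // [Intercept, β1]
console.log(lin.r2, lin.pValues);  // fit quality and per-coefficient p-values
console.log(lin.predict([[6]]));   // prediction for X = 6 (assumed input shape)
```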
Logistic Regression (Core)
Class: Regression.LogisticRegression
Constructor
new Regression.LogisticRegression(table: Record<string, number[]>, yName: string, xNames: string[], stepIndex: number, learningRate=0.01, epochs=1000)
Public fields / getters
- `coefficients: number[]` – `[Intercept, β1, …]`
- `y: number[]`, `X: number[][]`, `yHat: number[]` (predicted classes)
- `accuracy: number`
- `n: number`, `k: number`
- `result: { step, n, Variable, Coefficient, Accuracy }`
- `htmlTable: string`
Methods
- `calculate(): this`
- `predictProba(X: number[][]): number[]` – probabilities via sigmoid.
- `predict(X: number[][], threshold=0.5): number[]` – hard labels.
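And the logistic core, with the documented gradient-descent defaults (a sketch; data is illustrative):

```js
import { Analyze } from 'als-statistics';
const { LogisticRegression } = Analyze.Regression;

const table = { X: [0, 1, 2, 3, 4, 5], Y: [0, 0, 0, 1, 1, 1] };

// learningRate = 0.01 and epochs = 1000 are the documented defaults.
const logit = new LogisticRegression(table, 'Y', ['X'], 0).calculate();

console.log(logit.accuracy);               // share of correctly classified rows
console.log(logit.predictProba(logit.X));  // probabilities via sigmoid
console.log(logit.predict(logit.X, 0.5));  // hard 0/1 labels at the 0.5 threshold
```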
Analyze — overview & patterns
This section ties together the shortcuts across Table, Statistics, and Analyze.
From Table to analysis
import { Table } from 'als-statistics';
import { Analyze } from 'als-statistics';
const t = new Table(data, { name: 'Survey' });
// Correlation (single pair)
const r1 = t.correlate('height','weight').pearson();
// Correlation matrix (3+ columns)
const rAll = t.correlate('height','weight','age').pearson();
// Compare means (Welch, unequal variances)
const w = t.compareMeans('groupA','groupB').independentWelch();
// One-way ANOVA (classic/Welch)
const a1 = t.compareMeans('A','B','C').anova();
const aW = t.compareMeans('A','B','C').anovaWelch();
// Regression (linear/logistic)
const lin = new Analyze.Regression(t.columns, { yName:'score', xNames:['age','hours'] });
const log = new Analyze.Regression(t.columns, { yName:'passed', xNames:['score'], type:'logistic' });
Split → Combine (Statistics) → Analyze
// Split one table by a factor (returns Statistics with per-group tables)
const S = t.splitBy('group', { 0:'ctrl', 1:'treat' });
// Combine the same column across all split tables into one Table
const merged = S.columns('byGroup', 'score'); // -> ctrl_score, treat_score
// Now analyze as usual
const test = merged.compareMeans('ctrl_score','treat_score').independentWelch();
Keep mutations API-only (`addRow`, `setAt`, `splice`, `values =`). Avoid in-place array edits to preserve caches and consistent results.
Descriptive Statistics
Static utility functions used across the library. These are also mixed into Column (arity‑1 functions as getters; others as methods).
Selected functions
- `sum(values)` · `mean(values)` · `median(values)` · `mode(values)`
- `variance(values)` · `varianceSample(values)` · `stdDev(values)` · `stdDevSample(values)` · `cv(values)`
- `min(values)` · `max(values)` · `range(values)` · `iqr(values)` · `mad(values)`
- `zScore(values, v)` · `zScores(values)` · `zScoresSorted(values)`
- `percentile(values, p)` · `q1(values)` · `q3(values)` · `p10(values)` · `p90(values)`
- `weightedMean(values, weights)`
- `confidenceInterval({ mean, stdDevSample, values })`
- `outliersZScore(values, z=3)` · `outliersIQR(values)`
- `slope({ values })` · `regressionSlope({ X, Y })`
- `spectralPowerDensityArray(values)` · `spectralPowerDensityMetric(values)`
Refer to JSDoc in code for exact parameter objects where applicable.
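A few of them combined (note that some take a plain array while others take a parameter object, as in the signatures above; data is illustrative):

```js
import { Stats } from 'als-statistics';

const values = [4, 8, 15, 16, 23, 42];

console.log(Stats.iqr(values), Stats.mad(values));            // robust spread
console.log(Stats.outliersIQR(values));                       // IQR-based outliers
console.log(Stats.weightedMean(values, [1, 1, 1, 1, 1, 5]));  // weighted mean
console.log(Stats.confidenceInterval({
  mean: Stats.mean(values),
  stdDevSample: Stats.stdDevSample(values),
  values,
}));
```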
Utils
General helper utilities.
Functions
- `htmlTable(rows, headers, options?) → string` – render a simple HTML table (escapes content; supports `firstColHeader`, `fixed` decimals, transposition).
- `round(value, fixed=8) → number | string` – numeric rounding with fixed decimals.
- `range(start=0, end, step=1) → number[]` – numeric range.
- `filterKeys(keys: string[], filters: (string|number|RegExp)[]): string[]` – include names, regex filter, and `'-name'` exclusions.
- `Counter` – simple name counter with `getName(name?)`.
EPS (Golden Test Tolerances)
| Class | Key | Value | Notes |
|----------------|------------|---------|----------------------------------------|
| Descriptives | stat | 1e-6 | Means, medians, quantiles, variance |
| Z-scores | z | 1e-9 | Summary mean/std of z |
| Regression | reg | 1e-5 | Coefficients, metrics |
| CDF | cdf | 1e-9 | CDF/PPF checks |
| Correlations | r | 1e-7 | Pearson/Spearman/Kendall |
| Degrees of Fr. | df | 1e-6 | Welch df (float) |
| ANOVA F | anovaF | 1e-6 | |
| Flatness | flatness | 1e-12 | GM/AM stability |
| SPD Flatness | spd | 1e-12 | GM/AM on SPD |
| p-values | p | 1e-6 | |
Change these in goldens/settings.js if needed.
How‑to
Split a table by predicate and compare groups (Welch):
const { A, B } = Table.split(raw, r => r.group === 'A' ? 'A' : 'B');
const t = A.compareMeans('score', 'B.score').independentWelch();
Detect z-outliers and keep sorted indices:
const { zScores, indexes } = Stats.zScoresSorted({ values });
const top3 = indexes.slice(-3); // largest |z|
Compute spectral flatness of a spectrum:
const spd = Stats.spectralPowerDensityArray({ values: magnitudes });
const flat = Stats.spectralPowerDensityMetric({ spectralPowerDensityArray: spd, values: magnitudes });
Changelog [2.1.0] - 2025-09-04
Breaking change:
`als-statistics` v2 is a ground-up rewrite with no backward compatibility with v1.x.
If you rely on v1: pin your dependency to the latest 1.x release.
npm i als-statistics@^1
Changed
- `Stats.harmonicMean(...)` — inputs `≤ 0` are now clamped to ε (1e-12) before computation (aligns with Python goldens), preventing `NaN`/division-by-zero surprises.
- `Stats.zScores({ values }, sample = false)` — added a second parameter:
  - `sample = false` (default): population std (ddof = 0) — backward-compatible.
  - `sample = true`: sample std (ddof = 1) — matches NumPy/SciPy z-scores and golden summaries.
- `Stats.flatness({ values })` — now returns `0` when the arithmetic mean is `0` (previously `NaN`), making all-zero vectors well-defined.
Fixed
- `Stats.mad(...)` — corrected median absolute deviation for edge cases.
Tests
- Added golden cross-checks against Python (NumPy/SciPy) and HDBSCAN labels; all pass within documented EPS tolerances.
Notes: Default behavior remains the same for `zScores` (ddof = 0) unless `sample = true` is provided. If your code relied on `NaN` from `flatness`/`harmonicMean` for zero/negative inputs, update downstream checks accordingly.
