tgobi

v0.1.3

Published

2 months ago

Interactive high-dimensional data visualization in the browser.

Vulnerabilities: 0 High, 0 Medium, 0 Low

fcela

fcela-npm

ggobi visualization grand-tour projection-pursuit react

tgobi

Interactive high-dimensional data visualization in the browser, inspired by GGobi. Explore data through linked plots, animated tours, clustering, classification, and dimensionality reduction, all without leaving your browser.

tgobi: linked brushing, grand tour, parallel coordinates, and barcharts

Install

npm install -g tgobi

Use a local install when tgobi is part of a project:

npm install tgobi
npm exec tgobi

npm install tgobi puts the executable at node_modules/.bin/tgobi for that project. It does not make tgobi available as a bare shell command unless that directory is on your PATH. These are equivalent ways to run a local install:

npm exec tgobi
npx tgobi
./node_modules/.bin/tgobi

Use a global install when you want tgobi available directly from your shell:

npm install -g tgobi
tgobi

Command Line

The CLI serves the built standalone app and opens it in your browser:

tgobi
tgobi --port 8787
tgobi --host 0.0.0.0 --no-open

From this repository, build first and run the source checkout directly:

npm run build
node bin/tgobi.js --no-open

Screenshots

Load data from disk or start with a bundled sample:

tgobi file and sample loader

Open the variables panel and add linked plots:

tgobi flea sample workspace

Brush in one plot to highlight the same rows in every linked view, color by a categorical variable, and run a grand tour from the tour panel:

tgobi linked brushing and tour

Combine multiple plot types, such as parallel coordinates and barcharts, to explore relationships across dimensions. Selections are linked across all views: highlight a subset in a categorical barchart and immediately see where those rows fall in a 2D grand tour or a parallel coordinates plot:

tgobi multi-plot analysis with tour, parallel coordinates, and barcharts

Working With Data

tgobi accepts CSV, TSV, JSON, and GGobi-style XML files from the start screen. After loading a file, the schema preview lets you confirm inferred column types before committing the dataset.

The bundled samples are useful for quick checks:

flea: small categorical dataset for brushing, color, and tour examples.
olive: regional olive oil measurements.
places: mixed geographic and numeric data.
cycle: XML sample for GGobi import coverage.
large: synthetic large dataset for performance testing.

Using The App

Plots

Add plots with + Plot. Supported plot types:

| Type | Description |
|------|-------------|
| Scatterplot | Two numeric variables, x-y |
| Scatterplot matrix | 2-8 numeric variables, all pairwise scatterplots |
| Parallel coordinates | 2+ numeric variables, linked axes |
| Dotplot | Single numeric variable, 1D strip |
| Barchart | Single variable (categorical or numeric), frequency counts |
| Boxplot | Single numeric variable with optional grouping; shows median, quartiles, whiskers, outliers |
| Time series | Numeric x-axis, one or more y variables with optional grouping |
| Missing pattern | Overview of missingness across all variables |

Multiple plots are linked: selecting, painting, or hovering in one plot highlights the same rows in every other plot.

Brushing and Painting

Use the brush toolbar to select rows:

Transient: selection disappears when you release the mouse.
Persistent: each brush stroke paints a group with a distinct color (paint groups 1-8).

The selection toolbar offers:

| Button | Action |
|--------|--------|
| Exclude | Hide selected rows (shadow/ghost them) |
| Include | Restore selected shadowed rows |
| Invert | Flip which rows are shadowed |
| Isolate | Hide everything except the selected rows |
| Restore | Bring all rows back (clear shadow mask) |

Filtering and Excluding Rows

Shadowed (excluded) rows are dimmed in every plot and left out of all computations: tour, clustering, classification, and projection skip them. The status bar shows "N of M visible", where N counts only non-shadowed rows.
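One way to read the selection toolbar is as a set of operations on a per-row boolean shadow mask. The sketch below illustrates that reading only; it is not tgobi's internal code. The names `applyAction`, `Action`, and `visibleStatus` are hypothetical, and the Isolate rule (shadow everything that is not selected, keeping rows that were already shadowed hidden) is an assumption based on the "Exclude all but selected" label.

```typescript
// Hypothetical model: shadow[i] === true means row i is excluded (shadowed).
type Action = "exclude" | "include" | "invert" | "isolate" | "restore";

function applyAction(action: Action, shadow: boolean[], selected: boolean[]): boolean[] {
  return shadow.map((isShadowed, i) => {
    if (action === "exclude") return isShadowed || selected[i];  // hide selected rows
    if (action === "include") return isShadowed && !selected[i]; // un-shadow selected rows
    if (action === "invert") return !isShadowed;                 // flip the whole mask
    if (action === "isolate") return isShadowed || !selected[i]; // assumed: exclude all but selected
    return false;                                                // "restore": clear the mask
  });
}

// Status-bar text: N counts only non-shadowed rows.
function visibleStatus(shadow: boolean[]): string {
  const visible = shadow.filter((s) => !s).length;
  return `${visible} of ${shadow.length} visible`;
}
```

With four rows and the first one selected, `visibleStatus(applyAction("exclude", [false, false, false, false], [true, false, false, false]))` gives `"3 of 4 visible"`.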

Common workflow, excluding a category (e.g. removing one region from the olive oil dataset):

Add a Barchart of the categorical variable (e.g. region).
Click the bar for the category you want to remove. This selects all rows in that category (linked across every plot).
Click Exclude in the selection toolbar. The rows are now shadowed.
Repeat for any other categories you want to exclude.
To bring rows back: select them and click Include, or click Restore to un-shadow everything.

Alternative, keeping only a subset:

Select the rows you want to keep (brush in a scatterplot, click bars in a barchart, or drag a range in a boxplot).
Click Isolate ("Exclude all but selected"). Everything else is shadowed.

Exclude by boxplot range:

Add a Boxplot of the numeric variable.
Click the box body to select all rows in that group, or drag vertically on the boxplot to select a value range.
Click Exclude to shadow the selected rows.

Coloring

The color toolbar controls how points are colored:

Fixed: all points in one color.
Paint: color by painted group (persistent brushing).
By variable: color by a data column. Categorical variables pair well with tableau10; numeric variables support sequential or diverging scales.

Identify Tool

Switch to the Identify tool to hover over points and see their row label. Click to pin a label; click again to unpin. Set the label variable in the identify toolbar.

Keyboard Shortcuts

Press ? in the app to see the shortcut reference.

| Key | Action |
|-----|--------|
| B | Switch to brush tool |
| I | Switch to identify tool |
| T | Toggle transient / persistent brush mode |
| E | Exclude selected rows |
| R | Restore all excluded rows |
| Space | Play / pause tour |
| Esc | Clear selection or stop tour |

Shortcuts are ignored when focus is in an input, select, or textarea, or when meta/ctrl/alt is held.

Data Export

Click Export CSV in the toolbar to download the current dataset as a CSV file. The export respects the current view:

Visible only: by default, excluded (shadowed) rows are omitted.
Paint groups: appends a _paint_group column when rows have been painted.
Cluster labels: appends a _cluster column when clustering has been applied.
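The export rules above can be sketched as a small function. This is an illustration under stated assumptions, not tgobi's export code: the `ExportInput` shape, `csvField`, and `toCsv` are hypothetical names, and the quoting follows common RFC 4180 practice rather than anything documented for tgobi.

```typescript
// Hypothetical row model for illustration; tgobi's internal types may differ.
interface ExportInput {
  columns: string[];
  rows: (string | number)[][];
  shadow: boolean[];      // true = excluded (shadowed) row
  paintGroup?: number[];  // present only when rows have been painted
  cluster?: number[];     // present only when clustering has been applied
}

// Quote a field only when it contains a comma, quote, or line break.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\n\r]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

function toCsv(input: ExportInput, visibleOnly: boolean = true): string {
  const header = input.columns.slice();
  if (input.paintGroup) header.push("_paint_group");
  if (input.cluster) header.push("_cluster");

  const lines = [header.map((h) => csvField(h)).join(",")];
  input.rows.forEach((row, i) => {
    if (visibleOnly && input.shadow[i]) return; // omit shadowed rows by default
    const out = row.slice();
    if (input.paintGroup) out.push(input.paintGroup[i]);
    if (input.cluster) out.push(input.cluster[i]);
    lines.push(out.map((v) => csvField(v)).join(","));
  });
  return lines.join("\n");
}
```

Passing `visibleOnly = false` would keep shadowed rows in the output; whether the app exposes such a switch is not stated here.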

Edges

Load an edges layer (e.g. a graph or path) alongside your data. Edge visibility, alpha, and color mode are configurable. You can also draw sequential edges that connect rows in dataset order.

Hulls

Toggle convex hulls per paint group or color group to visually enclose clusters in scatterplots.
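For readers unfamiliar with convex hulls, the following sketch computes one for a set of 2D points using Andrew's monotone chain algorithm. It is a generic illustration; which hull algorithm tgobi uses is not stated, and `Point`, `cross`, and `convexHull` are names chosen for this example.

```typescript
type Point = [number, number];

// Cross product of (a - o) and (b - o); positive for a counter-clockwise turn.
function cross(o: Point, a: Point, b: Point): number {
  return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0]);
}

// Andrew's monotone chain: returns hull vertices in counter-clockwise order.
function convexHull(points: Point[]): Point[] {
  const sorted = points.slice().sort((p, q) => p[0] - q[0] || p[1] - q[1]);
  if (sorted.length < 3) return sorted;

  const build = (pts: Point[]): Point[] => {
    const chain: Point[] = [];
    for (const p of pts) {
      // Drop points that would make a clockwise or straight turn.
      while (chain.length >= 2 && cross(chain[chain.length - 2], chain[chain.length - 1], p) <= 0) {
        chain.pop();
      }
      chain.push(p);
    }
    chain.pop(); // the last point is the first point of the other chain
    return chain;
  };

  const lower = build(sorted);
  const upper = build(sorted.slice().reverse());
  return lower.concat(upper);
}
```

To draw one hull per paint group or color group, the function would be called once per group on that group's plotted coordinates.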

Right Sidebar Tabs

The right sidebar has four tabs: Tour, Project, Cluster, and Classify.

Tour Tab

Animate projections through high-dimensional space. Requires a scatterplot (2D tour) or dotplot (1D tour) to be open.

Shape:

2D (scatter): rotates a 2D projection plane through p-dimensional space.
1D (dotplot): rotates a 1D projection direction.
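As a rough sketch of what a projection plane or direction means here: a tour frame can be represented as d orthonormal vectors of length p (d = 2 for a scatterplot tour, d = 1 for a dotplot tour), and each data row is projected by taking dot products with those vectors. The code below picks one random frame with Gram-Schmidt and projects rows onto it. It does not cover interpolation between frames, and `randomFrame`, `project`, and the other names are hypothetical rather than part of tgobi's API.

```typescript
// Draw a standard normal value with the Box-Muller transform.
function randomNormal(): number {
  const u = 1 - Math.random(); // in (0, 1], avoids log(0)
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Random orthonormal frame: d vectors of length p, built by Gram-Schmidt.
function randomFrame(p: number, d: number): number[][] {
  if (d > p) throw new Error("frame dimension d must not exceed p");
  const frame: number[][] = [];
  while (frame.length < d) {
    let v: number[] = [];
    for (let j = 0; j < p; j++) v.push(randomNormal());
    for (const u of frame) {
      const c = dot(v, u);
      v = v.map((x, j) => x - c * u[j]); // remove the component along u
    }
    const norm = Math.sqrt(dot(v, v));
    if (norm > 1e-12) frame.push(v.map((x) => x / norm));
  }
  return frame;
}

// Project n rows (each of length p) onto the frame: n points in d dimensions.
function project(rows: number[][], frame: number[][]): number[][] {
  return rows.map((row) => frame.map((axis) => dot(row, axis)));
}
```

In a 2D frame, the pair `(frame[0][j], frame[1][j])` is variable j's contribution to the two plot axes; because the frame vectors are orthonormal, that pair always lies inside the unit circle.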

Modes:

| Mode | Description |
|------|-------------|
| Grand | Randomly walks through all projection planes. Good for overview. |
| Projection pursuit | Steers the tour toward projections that optimize an index. |
| Manual | Fixes all variables except one, letting you scrub that variable's contribution with a slider. |

Projection pursuit goals:

| Goal | Optimizes | When to use |
|------|-----------|-------------|
| Holes | 1 - central density | Finding projections with hollow structure (clusters on the rim) |
| Central mass | Central density | Finding projections with dense centers |
| LDA | Between-class / within-class variance | Requires 2+ painted groups; finds projections that separate groups |
| PCA variance | Total variance in projection | Finds projections that spread data out most |
| Kurtosis | Absolute excess kurtosis | Finding heavy-tailed or multi-modal structure |
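To make the first two goals concrete, here is a sketch of a central-mass index and its complement, the holes index, for already-projected points. The Gaussian kernel and the normalization by `exp(-d/2)` follow the form commonly used for these indices on sphered data (for example in the R tourr package). Whether tgobi uses exactly this formula is an assumption; docs/methods.md is the authority.

```typescript
// proj: n projected points, each of length d (1 or 2). The data are assumed
// to have been sphered (zero mean, identity covariance) before projection.
function centralMass(proj: number[][]): number {
  const n = proj.length;
  const d = proj[0].length;
  let sum = 0;
  for (const y of proj) {
    let sq = 0;
    for (const c of y) sq += c * c;
    sum += Math.exp(-0.5 * sq); // close to 1 near the center, close to 0 far away
  }
  const floor = Math.exp(-d / 2);
  return (sum / n - floor) / (1 - floor);
}

// Holes is the complement: large when the center of the projection is empty.
function holes(proj: number[][]): number {
  return 1 - centralMass(proj);
}
```

A projection pursuit tour would evaluate such an index on candidate projections and move toward the ones that score higher.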

The variable circle shows each variable's current contribution as a point on a unit circle. Frozen variables hold their direction while others rotate.

Saved views: click Save to bookmark the current projection. Click a saved view to restore it.

Project Tab

Compute a static low-dimensional embedding and add it to the dataset as new columns.

Methods:

| Method | Type | Output | Loadings |
|--------|------|--------|----------|
| PCA | Linear | Orthogonal components maximizing variance | Yes (eigenvectors) |
| MDS | Distance-based | Preserves pairwise distances | Permutation importance |
| ICA | Linear | Statistically independent components | Yes (mixing matrix) |
| t-SNE | Nonlinear | Preserves local neighborhoods | Permutation importance |
| UMAP | Nonlinear | Preserves local+global structure | Permutation importance |

Controls:

Method: choose the algorithm.
Dims: number of output dimensions (2+).
Variables: check which numeric columns to include.
Method-specific parameters (perplexity/iterations for t-SNE, neighbors/min dist for UMAP).

After computing:

X / Y: pick which dimensions to plot.
Add to data: materializes the embedding as new columns (e.g. PCA.1, PCA.2) and opens a scatterplot.
Clear: resets the projection.

Component information:

For PCA and ICA, the panel displays a loadings table showing how much each original variable contributes to each component. Headers are labeled PC1, PC2, ... (PCA) or IC1, IC2, ... (ICA). Values with |loading| > 0.5 are highlighted. A cumulative variance row (Cum %) shows running explained variance for PCA.
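The two derived quantities in the loadings table are simple to compute. The sketch below assumes the Cum % row is the running share of the PCA eigenvalue total, which is the usual definition; the function names are hypothetical.

```typescript
// Running explained variance (in percent) from PCA eigenvalues, largest first.
function cumulativeVariancePct(eigenvalues: number[]): number[] {
  const total = eigenvalues.reduce((a, b) => a + b, 0);
  let running = 0;
  return eigenvalues.map((ev) => {
    running += ev;
    return (100 * running) / total;
  });
}

// Flags for the highlight rule: |loading| > threshold (0.5 by default).
// loadings[i][k] is the loading of variable i on component k.
function highlightMask(loadings: number[][], threshold: number = 0.5): boolean[][] {
  return loadings.map((row) => row.map((v) => Math.abs(v) > threshold));
}
```

For eigenvalues `[3, 1]`, the cumulative row is `[75, 100]`: the first component alone explains three quarters of the variance.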

For MDS, t-SNE, and UMAP, a variable importance table ranks variables by how much the embedding changes when that variable is permuted (permutation importance, 3 repetitions). This identifies which variables most influence the nonlinear structure.
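Permutation importance can be sketched generically: shuffle one input column, recompute the embedding, and measure how far the embedded points moved. In the sketch below the embedding is passed in as a function, the change is measured as the mean Euclidean shift per row, and a small seeded generator keeps runs reproducible. All of these are assumptions made for illustration; in particular, how tgobi compares two embeddings (which for t-SNE and UMAP are only defined up to rotation and random initialization) is not described here.

```typescript
type Embed = (rows: number[][]) => number[][];

// Small linear congruential generator so the sketch is reproducible.
function lcg(seed: number): () => number {
  let state = seed % 4294967296;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Mean Euclidean distance between matching rows of two embeddings.
function meanShift(a: number[][], b: number[][]): number {
  let total = 0;
  for (let i = 0; i < a.length; i++) {
    let sq = 0;
    for (let k = 0; k < a[i].length; k++) sq += (a[i][k] - b[i][k]) * (a[i][k] - b[i][k]);
    total += Math.sqrt(sq);
  }
  return total / a.length;
}

function permutationImportance(rows: number[][], embed: Embed, reps: number = 3, seed: number = 1): number[] {
  const rand = lcg(seed);
  const baseline = embed(rows);
  const importance: number[] = [];
  for (let j = 0; j < rows[0].length; j++) {
    let total = 0;
    for (let r = 0; r < reps; r++) {
      // Fisher-Yates shuffle of column j only; other columns stay in place.
      const column = rows.map((row) => row[j]);
      for (let i = column.length - 1; i > 0; i--) {
        const k = Math.floor(rand() * (i + 1));
        const tmp = column[i];
        column[i] = column[k];
        column[k] = tmp;
      }
      const permuted = rows.map((row, i) => {
        const copy = row.slice();
        copy[j] = column[i];
        return copy;
      });
      total += meanShift(baseline, embed(permuted));
    }
    importance.push(total / reps); // average over repetitions
  }
  return importance;
}
```

A variable that the embedding ignores scores exactly 0, because permuting it leaves every embedded point where it was.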

See Methods Guide for the mathematical details.

Cluster Tab

Assign cluster labels to rows and paint them with distinct colors.

Methods:

| Method | Type | Key parameter | When to use |
|--------|------|---------------|-------------|
| K-Means | Fixed k | k | Known number of clusters, spherical clusters |
| Hierarchical | Fixed k | k, linkage | Small datasets, dendrogram-style |
| DBSCAN | Density-based | eps, minPts | Arbitrary shapes, noise detection |
| OPTICS | Density-based | eps, minPts, xi | Variable-density clusters |
| X-Means | Auto k | kMax | Unknown number of clusters (uses BIC) |

Workflow:

Check the numeric variables to cluster on.
Set method and parameters.
Click Compute.
Click Paint to color rows by cluster assignment.

Linkage options (hierarchical): complete, single, average.

X-Means iterates k = 1..kMax and picks the best k by Bayesian Information Criterion. OPTICS extracts clusters using the xi steepness parameter.
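As a point of reference for the fixed-k case, here is a minimal sketch of K-Means using Lloyd's algorithm. It is a textbook version, not tgobi's implementation: the centers start at the first k rows to keep the example deterministic, whereas real implementations usually use random or k-means++ seeding. `kMeans` and `sqDist` are hypothetical names.

```typescript
function sqDist(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
  return sum;
}

// Lloyd's algorithm: alternate assignment and center updates until labels stop changing.
function kMeans(rows: number[][], k: number, maxIter: number = 100): number[] {
  let centers = rows.slice(0, k).map((r) => r.slice()); // assumed seeding: first k rows
  let labels: number[] = rows.map(() => -1);
  for (let iter = 0; iter < maxIter; iter++) {
    // Assignment step: each row joins its nearest center.
    const next = rows.map((row) => {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (sqDist(row, centers[c]) < sqDist(row, centers[best])) best = c;
      }
      return best;
    });
    const changed = next.some((label, i) => label !== labels[i]);
    labels = next;
    if (!changed) break;
    // Update step: move each center to the mean of its members.
    centers = centers.map((old, c) => {
      const members = rows.filter((_, i) => labels[i] === c);
      if (members.length === 0) return old; // keep an empty cluster's center in place
      return old.map((_, j) => members.reduce((s, m) => s + m[j], 0) / members.length);
    });
  }
  return labels;
}
```

The returned array holds one cluster index per row, which is the kind of label the Paint step would turn into colors.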

Classify Tab

Build a classifier from painted groups and visualize the decision boundary in scatterplots, scatterplot matrices, and running 2D tours. The approach is inspired by R's classifly package. Instead of shading regions, tgobi samples the predictor space on a grid, asks the trained model for its prediction at each grid point, and keeps only the grid points whose prediction differs from that of a neighbor (the neighbor-disagreement rule). Those boundary points are rendered as outline rings, colored by their predicted class.
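The neighbor-disagreement rule can be sketched for a 2D grid as follows. The classifier is represented by a plain `predict` function, and a grid point counts as a boundary point when any of its four axis-aligned neighbors gets a different label. The choice of the 4-neighborhood and all names (`boundaryPoints`, `Predict`, `BoundaryPoint`) are assumptions for illustration.

```typescript
type Predict = (x: number, y: number) => number;

interface BoundaryPoint { x: number; y: number; label: number; }

function boundaryPoints(
  predict: Predict,
  xRange: [number, number],
  yRange: [number, number],
  resolution: number // grid points per axis, at least 2
): BoundaryPoint[] {
  const at = (range: [number, number], i: number) =>
    range[0] + ((range[1] - range[0]) * i) / (resolution - 1);

  // Predict once per grid point.
  const labels: number[][] = [];
  for (let i = 0; i < resolution; i++) {
    const column: number[] = [];
    for (let j = 0; j < resolution; j++) column.push(predict(at(xRange, i), at(yRange, j)));
    labels.push(column);
  }

  // Keep points where some in-grid neighbor has a different predicted class.
  const offsets = [[1, 0], [-1, 0], [0, 1], [0, -1]];
  const out: BoundaryPoint[] = [];
  for (let i = 0; i < resolution; i++) {
    for (let j = 0; j < resolution; j++) {
      const disagrees = offsets.some(([di, dj]) => {
        const ni = i + di;
        const nj = j + dj;
        return ni >= 0 && ni < resolution && nj >= 0 && nj < resolution && labels[ni][nj] !== labels[i][j];
      });
      if (disagrees) out.push({ x: at(xRange, i), y: at(yRange, j), label: labels[i][j] });
    }
  }
  return out;
}
```

For a classifier that splits the unit square at x = 0.5 and a resolution of 4, the two grid columns on either side of the split survive, giving 8 boundary points, each tagged with its own predicted class.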

Methods:

| Method | Key parameter | Description |
|--------|---------------|-------------|
| KNN | k | k-nearest neighbors with calibrated neighbor-fraction probabilities |
| Naive Bayes | - | Gaussian naive Bayes with softmax posterior |
| Logistic | lambda, iter | Multinomial logistic regression, L2 regularized |
| Random Forest | trees, max depth | Bagged decision trees with per-class vote ratio |

All four methods return calibrated per-class probabilities that drive the Uncertainty filter.

Workflow:

Brush 2+ groups of points (persistent mode); these become the training labels. Alternatively, set Class to a categorical variable in the data.
Check the numeric variables you want the model to use.
Boundary mode: choose either 2D slice (the grid varies only along the first 2 selected variables, with the others held at their training-set medians) or Full space (the grid varies along every predictor). Full-space grids remain meaningful in any tour projection, but the point count grows as resolution^p for p predictors; tgobi caps the total at 200 000 and shows the effective resolution next to the input.
Pick the Grid resolution. The label next to it shows the projected point count, e.g. 5×5 = 25 pts or 7⁶ = 117 649 pts (capped from 15).
Click Train, then Show to draw the boundary rings.
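The point-count arithmetic behind the Grid resolution label can be reproduced with a few lines. The rule used here, lowering the resolution until resolution^p fits under the 200 000 cap, is inferred from the "7⁶ = 117 649 pts (capped from 15)" example above; tgobi's exact rule may differ, and the function names are hypothetical.

```typescript
const MAX_GRID_POINTS = 200000; // cap stated in the workflow above

// Largest resolution r <= requested such that r^p stays within the cap.
function effectiveResolution(requested: number, p: number, cap: number = MAX_GRID_POINTS): number {
  let r = requested;
  while (r > 1 && Math.pow(r, p) > cap) r--;
  return r;
}

// Total grid points after capping, for p varying predictors.
function gridPointCount(requested: number, p: number): number {
  return Math.pow(effectiveResolution(requested, p), p);
}
```

With 6 predictors and a requested resolution of 15, `effectiveResolution(15, 6)` is 7 (since 8⁶ = 262 144 exceeds the cap) and `gridPointCount(15, 6)` is 117 649. A 2D slice at resolution 5 is untouched: 5 × 5 = 25 points.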

Uncertainty filter (slider): each boundary point also carries the classifier's uncertainty at that location, defined as 1 - max(class probability). Drag the slider to hide confident points and keep only the uncertain ones. A live "N of M shown" counter ticks down as you raise the threshold.
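A minimal sketch of the filter, assuming each boundary point carries its per-class probability vector and that a point is kept when its uncertainty is at least the slider value (whether the comparison is inclusive is a guess). The names are hypothetical.

```typescript
// Uncertainty of one boundary point: 1 - max(class probability).
function uncertainty(probs: number[]): number {
  return 1 - Math.max(...probs);
}

// Keep only points at or above the threshold and build the counter text.
function filterByUncertainty(
  points: number[][], // one probability vector per boundary point
  threshold: number
): { shown: number[][]; counter: string } {
  const shown = points.filter((probs) => uncertainty(probs) >= threshold);
  return { shown, counter: `${shown.length} of ${points.length} shown` };
}
```

For three points with probabilities `[0.9, 0.1]`, `[0.6, 0.4]`, and `[0.5, 0.5]`, a threshold of 0.3 hides the first point and the counter reads `"2 of 3 shown"`.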

Misclassified training points render as an X-cross over their painted glyph, so you can see at a glance which examples the model disagrees with.

Where boundaries appear:

The boundary is an overlay layer, not synthetic data rows. It draws in scatter and scatterplot-matrix plots whose axes are predictors, and in a running 2D tour over the same (or a superset of the) predictor variables. It does not appear in the missing-pattern view, parallel coordinates, boxplots, or CSV export; those views see the original data unchanged.

In a tour, the boundary grid is standardized the same way the tour worker standardizes the data and then multiplied by the active basis, so the rings stay aligned with the rotating clusters. Tour-active variables that aren't predictors contribute nothing to the projection (their standardized value is 0).
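The standardize-then-project step can be sketched as below. The sketch assumes standardization means subtracting a per-variable mean and dividing by a per-variable standard deviation taken from the data; the text does not say which standardization the tour worker uses. `projectBoundary` and its parameters are hypothetical.

```typescript
// Map boundary grid points into the current 2D tour view.
function projectBoundary(
  grid: number[][],         // grid points, one column per predictor
  predictorIndex: number[], // per tour variable: its column in grid, or -1 if not a predictor
  mean: number[],           // per tour variable, taken from the data (assumed)
  sd: number[],             // per tour variable, taken from the data (assumed)
  basis: number[][]         // per tour variable: its [x, y] coefficients in the active basis
): number[][] {
  return grid.map((point) => {
    let x = 0;
    let y = 0;
    for (let j = 0; j < basis.length; j++) {
      const col = predictorIndex[j];
      // Tour variables that are not predictors sit at standardized value 0.
      const z = col < 0 ? 0 : (point[col] - mean[j]) / sd[j];
      x += z * basis[j][0];
      y += z * basis[j][1];
    }
    return [x, y];
  });
}
```

Recomputing this on every animation frame with the current basis keeps the rings in the same coordinate system as the projected data points.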

Train/test split (optional): when enabled, the labeled data is split into training and test sets, stratified by class. The diagnostics panel then reports test-set accuracy and a 5-fold cross-validation estimate alongside the training-set confusion matrix.

Methods Guide

See docs/methods.md for the mathematical foundations of each algorithm, key equations, and implementation notes.

Embed In React

import { Tgobi } from "tgobi";
import "tgobi/styles.css";

export function MyPage() {
  return (
    <div style={{ height: "100vh" }}>
      <Tgobi />
    </div>
  );
}

You can pass a DataFrame-compatible object as data:

<Tgobi data={myDataFrame} />
