@iyulab/u-insight

v0.3.1

Statistical analysis and data profiling engine with C FFI bindings.

u-insight

A statistical analysis and data profiling engine in Rust with C FFI bindings.

Overview

u-insight transforms raw tabular data into actionable statistical insights. It operates in two distinct layers with opposite assumptions about input data quality:

CSV (raw)
  │
  ├─→ Profiling ─→ "What is the state of this data?"
  │     Tolerates dirty data (missing values, type mismatches expected)
  │
  │   (external preprocessing)
  │
  └─→ Analysis  ─→ "What can we learn from this data?"
        Requires clean numeric data (no NaN, no missing)

Built on u-analytics (statistical algorithms) and u-numflow (math primitives).
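The "no NaN, no missing" precondition of the analysis layer can be checked before any analysis call. The following is a minimal standalone sketch of such a check, not part of the u-insight API: the helper name `ensure_clean` and the row-major `Vec<Vec<f64>>` layout are assumptions made for illustration.

```rust
/// Hypothetical helper (not a u-insight function): verifies that a
/// row-major numeric table contains no NaN cells.
/// Returns Err((row, column)) for the first NaN found, scanning row by row.
pub fn ensure_clean(rows: &[Vec<f64>]) -> Result<(), (usize, usize)> {
    for (r, row) in rows.iter().enumerate() {
        for (c, value) in row.iter().enumerate() {
            if value.is_nan() {
                return Err((r, c));
            }
        }
    }
    Ok(())
}
```

Returning the position of the offending cell, rather than a bare boolean, makes it easier to report which value the external preprocessing step missed.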

Modules

Data Layer

| Module | Description |
|--------|-------------|
| dataframe | Column-major tabular data model (DataFrame, Column, DataType) |
| csv_parser | CSV parsing with automatic type inference |
| error | Error types (InsightError) |

Profiling Layer (dirty data tolerated)

| Module | Description |
|--------|-------------|
| profiling | Column-level and dataset-level data profiling: descriptive stats, missing analysis, outlier flagging (IQR/Z-score/Modified Z-score), diagnostic flags |
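To illustrate what IQR-based outlier flagging on dirty data can look like, here is a minimal standalone sketch. It is not the profiling module's implementation: the function names, the `Option<f64>` representation of missing cells, the linear-interpolation quantile, and the conventional 1.5 × IQR fences are all assumptions.

```rust
/// Quantile by linear interpolation on an already sorted, non-empty slice.
/// (The interpolation method is an assumption; other definitions exist.)
fn quantile(sorted: &[f64], q: f64) -> f64 {
    let pos = q * (sorted.len() - 1) as f64;
    let lo = pos.floor() as usize;
    let hi = pos.ceil() as usize;
    sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo as f64)
}

/// Hypothetical helper: flags values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
/// Missing cells (None) and NaN are tolerated: they are skipped when the
/// fences are computed and are never flagged themselves.
pub fn iqr_outliers(values: &[Option<f64>]) -> Vec<bool> {
    let mut present: Vec<f64> = values
        .iter()
        .flatten()
        .copied()
        .filter(|v| !v.is_nan())
        .collect();
    if present.is_empty() {
        return vec![false; values.len()];
    }
    present.sort_by(|a, b| a.total_cmp(b));

    let q1 = quantile(&present, 0.25);
    let q3 = quantile(&present, 0.75);
    let iqr = q3 - q1;
    let (low, high) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr);

    values
        .iter()
        .map(|v| matches!(v, Some(x) if *x < low || *x > high))
        .collect()
}
```

The output has one flag per input cell, so it lines up with the original column even when some cells are missing.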

Analysis Layer (clean data required)

| Module | Description |
|--------|-------------|
| analysis | Correlation (Pearson/Spearman), regression (simple/multiple OLS), Cramer's V contingency analysis |
| clustering | K-Means++ (auto-K, Gap Statistic), Mini-Batch K-Means, DBSCAN, Hierarchical Agglomerative (Single/Complete/Average/Ward), HDBSCAN |
| distribution | ECDF, histogram bins (Sturges/Scott/FD), QQ-plot, normality tests (KS, Jarque-Bera, Shapiro-Wilk, Anderson-Darling), Grubbs test, distribution fitting |
| pca | Principal Component Analysis with auto-scaling option |
| isolation_forest | Isolation Forest anomaly detection (Liu et al. 2008) |
| lof | Local Outlier Factor (LOF) density-based anomaly detection |
| mahalanobis | Mahalanobis distance multivariate outlier detection |
| feature_importance | Variance threshold, correlation filter, VIF, condition number, composite importance, ANOVA F-test selection, Mutual Information, Permutation Importance |
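As a concrete reference for the simplest entry in this table, the sketch below computes a Pearson correlation coefficient in plain Rust. It is an illustration only and does not reproduce the analysis module's API; the function name `pearson` and the choice to return `None` for degenerate input are assumptions.

```rust
/// Hypothetical helper: Pearson correlation of two equally long samples.
/// Returns None when the lengths differ, fewer than two points are given,
/// or either sample has zero variance (the coefficient is undefined).
/// Assumes clean input, matching the analysis layer's precondition.
pub fn pearson(x: &[f64], y: &[f64]) -> Option<f64> {
    if x.len() != y.len() || x.len() < 2 {
        return None;
    }
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;

    // Accumulate the co-deviation and the two squared deviations.
    let (mut sxy, mut sxx, mut syy) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (a, b) in x.iter().zip(y) {
        let (dx, dy) = (a - mean_x, b - mean_y);
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    if sxx == 0.0 || syy == 0.0 {
        return None;
    }
    Some(sxy / (sxx * syy).sqrt())
}
```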

FFI Layer

| Module | Description |
|--------|-------------|
| ffi | C FFI bindings: 32 functions, 20 #[repr(C)] structs, auto-generated C header via cbindgen |

Quick Start

use u_insight::csv_parser::CsvParser;
use u_insight::profiling::profile_dataframe;

// 1. Parse CSV
let csv = "name,value,active\nAlice,1.5,true\nBob,2.3,false\nCharlie,3.1,true\n";
let df = CsvParser::new().parse_str(csv).unwrap();

// 2. Profile
let profiles = profile_dataframe(&df);

Clustering

use u_insight::clustering::{kmeans, dbscan, KMeansConfig, DbscanConfig};

let data = vec![
    vec![0.0, 0.0], vec![0.5, 0.5],
    vec![10.0, 10.0], vec![10.5, 10.5],
];

// K-Means
let km = kmeans(&data, &KMeansConfig::new(2)).unwrap();
assert_eq!(km.k, 2);

// DBSCAN
let db = dbscan(&data, &DbscanConfig::new(1.5, 2)).unwrap();
assert_eq!(db.n_clusters, 2);

Distribution Analysis

use u_insight::distribution::{distribution_analysis, DistributionConfig};

let data: Vec<f64> = (0..50).map(|i| (i as f64 - 25.0) * 0.2).collect();
let result = distribution_analysis(&data, &DistributionConfig::default()).unwrap();
println!("Normal: {}", result.normality.is_normal);
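For readers unfamiliar with the normality tests listed earlier, the Jarque-Bera statistic is the easiest to write down: JB = n/6 · (S² + K²/4), where S is the sample skewness and K the excess kurtosis. The sketch below computes it from biased central moments in plain Rust. It is a standalone illustration, not the distribution module's code, and the function name `jarque_bera` is an assumption; converting the statistic to a p-value (chi-squared with 2 degrees of freedom) is left out.

```rust
/// Hypothetical helper: Jarque-Bera statistic from biased central moments.
/// Returns None for fewer than three points or zero variance.
pub fn jarque_bera(data: &[f64]) -> Option<f64> {
    if data.len() < 3 {
        return None;
    }
    let n = data.len() as f64;
    let mean = data.iter().sum::<f64>() / n;

    // Second, third and fourth central moments.
    let m2 = data.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    if m2 == 0.0 {
        return None;
    }
    let m3 = data.iter().map(|x| (x - mean).powi(3)).sum::<f64>() / n;
    let m4 = data.iter().map(|x| (x - mean).powi(4)).sum::<f64>() / n;

    let skewness = m3 / m2.powf(1.5);
    let excess_kurtosis = m4 / (m2 * m2) - 3.0;
    Some(n / 6.0 * (skewness * skewness + excess_kurtosis * excess_kurtosis / 4.0))
}
```

Larger values indicate a stronger departure from the skewness and kurtosis of a normal distribution.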

C FFI

u-insight builds as cdylib + staticlib for cross-language interop. A C header (u_insight.h) is auto-generated by cbindgen at build time.

Profiling

| Function | Description |
|----------|-------------|
| insight_profile_csv | Profile a CSV string → opaque context |
| insight_profile_free | Free profile context |
| insight_profile_row_count | Row count from profile |
| insight_profile_col_count | Column count from profile |
| insight_profile_column | Get column summary |

Clustering

| Function | Description |
|----------|-------------|
| insight_kmeans | K-Means++ clustering |
| insight_mini_batch_kmeans | Mini-Batch K-Means clustering |
| insight_dbscan | DBSCAN density-based clustering |
| insight_hierarchical | Hierarchical Agglomerative clustering (4 linkages) |
| insight_hdbscan | HDBSCAN clustering with membership probabilities |
| insight_gap_statistic | Gap statistic for optimal K selection |

Dimensionality Reduction

| Function | Description |
|----------|-------------|
| insight_pca | Principal Component Analysis |

Anomaly Detection

| Function | Description |
|----------|-------------|
| insight_isolation_forest | Isolation Forest anomaly detection |
| insight_lof | Local Outlier Factor detection |
| insight_mahalanobis | Mahalanobis distance outlier detection |

Statistical Analysis

| Function | Description |
|----------|-------------|
| insight_correlation | Pearson correlation matrix |
| insight_regression | Simple linear regression |
| insight_cramers_v | Cramer's V contingency analysis |

Distribution

| Function | Description |
|----------|-------------|
| insight_distribution | Normality testing (KS, JB, SW, AD) |

Feature Importance

| Function | Description |
|----------|-------------|
| insight_feature_importance | Composite feature importance scores |
| insight_anova_select | ANOVA F-test feature selection |
| insight_mutual_info | Mutual information feature ranking |
| insight_permutation_importance | Permutation importance for regression |

Memory Management

| Function | Description |
|----------|-------------|
| insight_free_labels | Free u32 label arrays |
| insight_free_i32_array | Free i32 arrays |
| insight_free_f64_array | Free f64 arrays |
| insight_free_anova_features | Free ANOVA feature arrays |
| insight_free_mi_features | Free MI feature arrays |
| insight_free_perm_features | Free permutation importance arrays |
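Dedicated free functions exist because memory allocated by Rust must be released by Rust's allocator, not by the caller's `free`. The sketch below shows one common way to implement that ownership handoff for an f64 array. The names `export_f64_array` and `sketch_free_f64_array`, and the pointer-plus-length signature, are assumptions for illustration; the actual signatures of the `insight_free_*` functions are defined in the generated u_insight.h header.

```rust
/// Hypothetical exporter: hands ownership of a Vec<f64> to the caller as a
/// raw pointer, writing the element count to `out_len`.
pub fn export_f64_array(values: Vec<f64>, out_len: &mut usize) -> *mut f64 {
    // A boxed slice has exactly `len` elements and no spare capacity,
    // so pointer + length is enough to rebuild it later.
    let boxed: Box<[f64]> = values.into_boxed_slice();
    *out_len = boxed.len();
    Box::into_raw(boxed) as *mut f64
}

/// Hypothetical free function (a real export would also need a no_mangle
/// attribute so that C can link against the symbol).
///
/// # Safety
/// `ptr` and `len` must come from a single `export_f64_array` call and
/// must not be freed twice. A null pointer is accepted and ignored.
pub unsafe extern "C" fn sketch_free_f64_array(ptr: *mut f64, len: usize) {
    if ptr.is_null() {
        return;
    }
    // Rebuild the boxed slice and drop it, returning the memory to Rust.
    let slice = std::ptr::slice_from_raw_parts_mut(ptr, len);
    unsafe { drop(Box::from_raw(slice)) };
}
```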

Error & Version

| Function | Description |
|----------|-------------|
| insight_last_error | Last error message (thread-local) |
| insight_clear_error | Clear error state |
| insight_version | Library version string |

All FFI functions use catch_unwind to prevent panics from crossing the FFI boundary.
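The combination of `catch_unwind` and a thread-local error slot can be sketched as follows. This is a minimal illustration of the pattern using only the standard library, not the crate's actual code: `guard`, `last_error`, `sketch_mean`, the status codes 0/1/2, and the error strings are all hypothetical.

```rust
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};

thread_local! {
    // One error slot per thread, so concurrent callers do not overwrite
    // each other's messages.
    static LAST_ERROR: RefCell<Option<String>> = RefCell::new(None);
}

fn set_last_error(msg: String) {
    LAST_ERROR.with(|slot| *slot.borrow_mut() = Some(msg));
}

/// Returns a copy of the calling thread's last error message, if any.
pub fn last_error() -> Option<String> {
    LAST_ERROR.with(|slot| slot.borrow().clone())
}

/// Runs `f` and maps the outcome to a status code:
/// 0 = success, 1 = reported error, 2 = panic caught before the boundary.
pub fn guard<F: FnOnce() -> Result<(), String>>(f: F) -> i32 {
    match panic::catch_unwind(AssertUnwindSafe(f)) {
        Ok(Ok(())) => 0,
        Ok(Err(msg)) => {
            set_last_error(msg);
            1
        }
        Err(_) => {
            set_last_error("panic caught at FFI boundary".to_string());
            2
        }
    }
}

/// Hypothetical exported function: writes the mean of `data` to `out`.
///
/// # Safety
/// `data` must point to `len` readable f64 values and `out` to one
/// writable f64, unless they are null (which is reported as an error).
pub unsafe extern "C" fn sketch_mean(data: *const f64, len: usize, out: *mut f64) -> i32 {
    guard(|| {
        if data.is_null() || out.is_null() {
            return Err("null pointer".to_string());
        }
        if len == 0 {
            return Err("empty input".to_string());
        }
        let slice = unsafe { std::slice::from_raw_parts(data, len) };
        unsafe { *out = slice.iter().sum::<f64>() / len as f64 };
        Ok(())
    })
}
```

Routing every exported function through one wrapper keeps the panic handling in a single place, and the caller only ever sees an integer status plus an optional message.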

C# Binding (UInsight)

Install via NuGet — native libraries are bundled automatically:

dotnet add package UInsight

using UInsight;

using var client = new InsightClient();
Console.WriteLine(client.GetVersion());

var data = new double[,] { {0,0}, {1,1}, {10,10}, {11,11} };
var result = client.KMeans(data, k: 2);
Console.WriteLine($"K={result.K}, WCSS={result.Wcss:F2}");

The binding is in bindings/csharp/UInsight/ with:

  • Interop/NativeLibrary.cs: [LibraryImport] declarations for all 32 FFI functions
  • Interop/NativeStructs.cs: [StructLayout] mappings for all 20 C structs
  • InsightClient.cs: High-level managed API (automatic memory management)
  • InsightException.cs: Error code to exception conversion

Test Status

357 lib tests + 49 doc-tests = 406 total
0 clippy warnings
Build: lib + cdylib + staticlib
C header: auto-generated via cbindgen (20 structs, 32 functions)

Scope & Non-Goals

In Scope:

  • Data profiling (dirty data → quality report + diagnostic flags)
  • Statistical analysis (clean data → patterns + relationships)
  • Correlation, regression, clustering, PCA, anomaly detection
  • Feature importance and selection (ANOVA, MI, Permutation)
  • Distribution analysis and normality testing
  • C FFI for cross-language use
  • C# binding (UInsight NuGet package)

Out of Scope:

  • Visualization / charting
  • Data cleaning / transformation / imputation
  • ML model training / deployment
  • Deep learning

Requirements

  • Rust 1.75+
  • Dependencies: u-analytics, u-numflow

License

MIT License