npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

table-header-detective

v1.0.0

Published

Intelligent detection of header rows in tabular data using multiple heuristics

Readme

🕵️ Table Header Detective

npm version License: MIT TypeScript

An intelligent utility for detecting header rows in tabular data using multiple heuristics and adaptive algorithms.

🔍 Features

  • Detects if tabular data has one or more header rows
  • Analyzes multiple factors to make an intelligent decision:
    • Type differences (strings vs numbers)
    • Capitalization patterns (Title Case, ALL CAPS)
    • Format consistency
    • Text length patterns
    • Special character usage
    • Empty cell analysis
    • Value uniqueness
    • Common header terminology
  • Advanced detection capabilities:
    • Multi-header detection for tables with title rows
    • Adaptive weighting based on table size and characteristics
    • Handling of challenging edge cases and varied formats
  • Returns detailed reasoning for why rows were classified as headers
  • Written in TypeScript with full type safety
  • Zero dependencies
  • Configurable detection parameters
  • Thoroughly tested with extensive test fixtures

📦 Installation

npm install table-header-detective
# or
yarn add table-header-detective

🚀 Quick Start

// Import the detector and all default heuristics
import { detectHeaderRows, DEFAULT_HEURISTICS } from "table-header-detective";

// Sample tabular data
const tableData = [
  ["ID", "Name", "Age", "Department", "Salary"],
  [1, "John Doe", 32, "Marketing", 58000],
  [2, "Jane Smith", 28, "Engineering", 72000],
  [3, "Bob Johnson", 45, "Finance", 91000],
];

// Detect header rows using all heuristics
const result = detectHeaderRows(tableData, {}, DEFAULT_HEURISTICS);

console.log(`Found ${result.headerRowIndices.length} header rows`);
console.log(`Confidence: ${(result.confidence * 100).toFixed(1)}%`);
console.log("Header rows:", result.headerRowIndices);

Multi-header Detection

The library can detect complex header structures with multiple consecutive header rows:

// Table with title and column headers
const multiHeaderTable = [
  ["QUARTERLY FINANCIAL REPORT", "", "", "", "", ""],
  ["Q1", "Q2", "Q3", "Q4", "YTD", "Budget"],
  [125000, 132000, 141000, 138000, 536000, 520000],
  [95000, 97000, 102000, 99000, 393000, 400000],
];

const result = detectHeaderRows(multiHeaderTable, {}, DEFAULT_HEURISTICS);
// Detects rows 0 and 1 as headers with high confidence

Optimizing Bundle Size

For production applications, you can reduce bundle size by only importing the specific heuristics you need:

import { detectHeaderRows } from "table-header-detective";
import { 
  TypeDifferenceHeuristic, 
  CapitalizationPatternHeuristic 
} from "table-header-detective";

// Use only specific heuristics to reduce bundle size
const result = detectHeaderRows(tableData, {}, [
  TypeDifferenceHeuristic,
  CapitalizationPatternHeuristic
]);

📋 API Reference

detectHeaderRows(rows, options?, heuristics?)

Main function that analyzes tabular data and detects header rows.

Parameters

  • rows: TableData - The tabular data to analyze as an array of arrays
  • options?: HeaderDetectionOptions - Optional configuration settings
  • heuristics?: HeaderHeuristic[] - Array of heuristics to use for detection. You must provide either specific heuristics or DEFAULT_HEURISTICS.

Returns

Returns a HeaderDetectionResult object containing:

  • headerRowIndices: number[] - Array of row indices identified as headers
  • confidence: number - Confidence level (0-1) that the detected rows are headers
  • message: string - Human-readable result message
  • details: RowDetail[] - Detailed analysis of each row

Available Heuristics

The library provides multiple heuristics, each analyzing different aspects of the table:

  • TypeDifferenceHeuristic - Analyzes data type patterns (strings vs. numbers)
  • CapitalizationPatternHeuristic - Detects typical header capitalization (ALL CAPS, Title Case)
  • FormatConsistencyHeuristic - Identifies format pattern differences between rows
  • TextLengthPatternHeuristic - Analyzes text length patterns (headers tend to be shorter)
  • SpecialCharUsageHeuristic - Examines special character usage in cells
  • EmptyCellAnalysisHeuristic - Analyzes patterns of empty cells
  • ValueUniquenessHeuristic - Checks for unique values typical of headers
  • HeaderTerminologyHeuristic - Detects common header terms and terminology

Import only the heuristics you need to optimize bundle size, or use DEFAULT_HEURISTICS to include all of them.

Adaptive Weighting System

The detector uses an intelligent adaptive weighting system that adjusts heuristic importance based on table characteristics:

  • For small tables (1-3 rows):

    • Upweights terminology-based detection
    • Downweights pattern-based heuristics that need more rows
  • For large tables (10+ rows):

    • Upweights pattern recognition heuristics
    • Better utilizes statistical patterns across larger datasets

This ensures optimal header detection regardless of table size or structure.

Configuration Options

You can customize the detection behavior:

import { detectHeaderRows, DEFAULT_HEURISTICS } from "table-header-detective";

const result = detectHeaderRows(myData, {
  // Minimum score to consider a row a header (0-1)
  headerThreshold: 0.6, // Default is 0.6

  // Maximum number of consecutive header rows to detect
  maxHeaderRows: 3, // Default is 3

  // Whether to apply first row bias (higher score for first row)
  applyFirstRowBias: true, // Default is true

  // Additional custom header terms to look for
  customHeaderTerms: ["code", "reference", "location"],
  
  // Minimum rows needed for sampling patterns
  minSampleRows: 2 // Default is 2
}, DEFAULT_HEURISTICS);

// You can also pass configuration options to specific heuristics
import { HeaderTerminologyHeuristic } from "table-header-detective";

const customTerminologyHeuristic = {
  ...HeaderTerminologyHeuristic,
  analyze: (rows) => HeaderTerminologyHeuristic.analyze(rows, {
    customHeaderTerms: ["product_id", "sku_code", "category_name"],
    matchPartialTerms: true,
    headerTermBoost: 0.5
  })
};

const result = detectHeaderRows(myData, options, [
  customTerminologyHeuristic,
  // ...other heuristics
]);

📊 Examples

Example 1: CSV Data

import { detectHeaderRows } from "table-header-detective";
import { DEFAULT_HEURISTICS } from "table-header-detective";
import { parse } from "csv-parse/sync";
import fs from "fs";

// Read and parse CSV
const csvData = fs.readFileSync("data.csv", "utf8");
const records = parse(csvData);

// Detect headers
const result = detectHeaderRows(records, {}, DEFAULT_HEURISTICS);

if (result.headerRowIndices.length > 0) {
  // Extract headers
  const headers = result.headerRowIndices.map((idx) => records[idx]);
  console.log("Detected headers:", headers);

  // Extract data (rows after headers)
  const lastHeaderIdx = Math.max(...result.headerRowIndices);
  const data = records.slice(lastHeaderIdx + 1);
  console.log(`${data.length} data rows found`);
}

Example 2: Exploring Detailed Results

import { detectHeaderRows } from "table-header-detective";

const data = [
  /* your table data */
];
const result = detectHeaderRows(data);

// Examine detailed analysis for each row
result.details.forEach((detail) => {
  console.log(`Row ${detail.rowIndex}:`);
  console.log(`  Is header: ${detail.isLikelyHeader}`);
  console.log(`  Header Type: ${detail.headerType}`);
  console.log(`  Confidence: ${(detail.confidence * 100).toFixed(1)}%`);
  console.log(`  Reasons:`);
  detail.reasons.forEach((reason) => {
    console.log(`    - ${reason}`);
  });
  console.log("");
});

🛠️ Advanced Usage

Custom Heuristic Combinations

You can select and configure specific heuristics based on your data characteristics:

import { detectHeaderRows } from "table-header-detective";
import { 
  TypeDifferenceHeuristic, 
  HeaderTerminologyHeuristic,
  EmptyCellAnalysisHeuristic
} from "table-header-detective";

// For tables with mixed data types and specialized terminology
const result = detectHeaderRows(tableData, options, [
  TypeDifferenceHeuristic,
  HeaderTerminologyHeuristic,
  EmptyCellAnalysisHeuristic
]);

Multi-Header Table Detection

The library is specially designed to recognize complex multi-header tables, like:

// Complex header with title rows and column headers
const complexHeaderTable = [
  ['EMPLOYEE INFORMATION', '', '', '', ''],
  ['FISCAL YEAR 2023', '', '', '', ''],
  ['ID', 'Name', 'Age', 'Department', 'Salary'],
  [1, 'John Doe', 32, 'Marketing', 58000],
  // More data rows...
];

const result = detectHeaderRows(complexHeaderTable, {}, DEFAULT_HEURISTICS);
// Will detect all three rows as headers (0, 1, and 2)

Creating Custom Heuristics

You can create your own custom heuristics by implementing the HeaderHeuristic interface:

import { HeaderHeuristic } from "table-header-detective";
import { TableData, HeuristicResult } from "table-header-detective";

// Custom heuristic for multi-line headers
const MultiLineHeaderHeuristic: HeaderHeuristic = {
  name: 'multiLineHeader',
  analyze: (rows: TableData): HeuristicResult => {
    // Your custom analysis logic here
    const rowScores: number[] = [];
    const reasons: string[][] = [];
    
    // Analyze each row and populate rowScores and reasons
    // ...
    
    return { rowScores, reasons };
  }
};

// Use your custom heuristic
const result = detectHeaderRows(tableData, options, [
  TypeDifferenceHeuristic,
  MultiLineHeaderHeuristic
]);

Helper Functions

The package exports additional helper functions for advanced usage:

  • isTitleCase(str) - Checks if a string follows Title Case convention
  • detectCellType(value) - Detects the data type of a cell value
  • getRowCompleteness(row, options) - Analyzes how complete a row is
  • normalizeValue(value, options) - Normalizes values for comparison

🧪 Testing

The library includes comprehensive test coverage for various table scenarios:

  • Basic header detection for simple tables
  • Complex multi-level header structures
  • Tables with various formatting challenges
  • Large tables with many rows
  • Edge cases and challenging patterns

Each heuristic is individually tested, and integration tests verify the system works correctly as a whole.

🔮 Future Enhancements

Future versions will add support for even more complex table structures:

  • Full multi-level header detection with column groups spanning multiple columns
  • Better detection of mid-table category headers
  • Support for merged cell patterns from spreadsheet exports
  • Handling of tables with irregular structures

📄 License

MIT © Lyle Underwood