@rastaweb/domoscope

v1.0.0

Domoscope is an HTML diff engine with intelligent DOM comparison, configurable tracking, and comprehensive statistics

🔍 Domoscope

Advanced HTML diff engine with intelligent DOM comparison and comprehensive change statistics.

Domoscope is a TypeScript library for comparing HTML content with intelligent element matching, word-level text diffing, and detailed change tracking.

📦 Installation

npm install @rastaweb/domoscope
yarn add @rastaweb/domoscope
pnpm add @rastaweb/domoscope

Requirements: Node.js ≥16.0.0, TypeScript ≥4.5.0 (optional)

🚀 Quick Start

Basic Usage

import { getCustomDiffStats } from '@rastaweb/domoscope';

const oldHTML = '<div><p>Original content</p></div>';
const newHTML = '<div><p>Modified content</p><span>Added content</span></div>';

const { diffResult, stats } = getCustomDiffStats(oldHTML, newHTML);

console.log(`Added ${stats.totalAddedTags} elements`);
console.log(`Removed ${stats.totalRemovedTags} elements`);
console.log(`Changed ${stats.totalChangedTags} elements`);

TypeScript Usage

import {
  getCustomDiffStats,
  compareElements,
  formatTagStatsSummary,
  type DiffStats,
  type ExtendedCompareOptions,
} from '@rastaweb/domoscope';

const options: ExtendedCompareOptions = {
  addedClass: 'highlight-added',
  removedClass: 'highlight-removed',
  watchedTags: ['img', 'a', 'button'],
  minSimilarityThreshold: 0.3,
};

const result = getCustomDiffStats(oldHTML, newHTML, options);
const summary = formatTagStatsSummary(result.stats);

📚 API Reference

Core Functions

getCustomDiffStats(oldHTML, newHTML, options?)

High-level function that parses HTML, performs diff, and returns both modified DOM and statistics.

function getCustomDiffStats(
  oldHTML: string,
  newHTML: string,
  options?: ExtendedCompareOptions
): DiffResultWithStats;

Parameters:

  • oldHTML (string): Original HTML content
  • newHTML (string): Modified HTML content
  • options (ExtendedCompareOptions, optional): Configuration options

Returns: Object with diffResult and stats properties

Example:

const result = getCustomDiffStats('<div>Old</div>', '<div>New</div>', {
  addedClass: 'added',
  removedClass: 'removed',
});

compareElements(oldElements, newElements, options?)

Compare arrays of DOM elements with intelligent pairing and recursive diffing.

function compareElements(
  oldElements: Element[],
  newElements: Element[],
  options?: ExtendedCompareOptions
): void;

Parameters:

  • oldElements (Element[]): Array of original elements
  • newElements (Element[]): Array of modified elements
  • options (ExtendedCompareOptions, optional): Configuration options

Side Effects: Modifies the DOM in-place with diff annotations

Example:

const oldTree = stringToFlatTree('<div><p>Content</p></div>');
const newTree = stringToFlatTree('<div><p>New content</p></div>');
compareElements(oldTree.rootElements, newTree.rootElements);

formatTagStatsSummary(stats)

Generate human-readable summary of diff statistics.

function formatTagStatsSummary(stats: DiffStats): string;

Parameters:

  • stats (DiffStats): Statistics object from diff operation

Returns: Multi-line string with formatted statistics

Example:

const summary = formatTagStatsSummary(stats);
console.log(summary); // "DOMOSCOPE DIFF STATISTICS\n  Added: 2 elements..."

getChangedTagsList(stats)

Extract list of changed tags with their attributes.

function getChangedTagsList(stats: DiffStats): Array<{
  tagName: string;
  count: number;
  changedAttributes: string[];
}>;
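
To make the return shape concrete, here is a minimal sketch of how a per-tag record like `DiffStats.changedTags` could be flattened into such a list. The names `StatsLike` and `listChangedTags` are hypothetical and exist only for this illustration; this is not the library's implementation, and the real function may order or filter entries differently.

```typescript
// Hypothetical restatement of the documented shapes; only the fields used here.
interface ChangedTagInfo {
  count: number;
  changedAttributes: string[];
}

interface StatsLike {
  changedTags?: Record<string, ChangedTagInfo>;
}

interface ChangedTagEntry extends ChangedTagInfo {
  tagName: string;
}

// Flatten the per-tag record into an array, keeping the record's key order.
function listChangedTags(stats: StatsLike): ChangedTagEntry[] {
  const record = stats.changedTags ?? {};
  return Object.keys(record).map((tagName) => ({
    tagName,
    count: record[tagName].count,
    // Copy the array so callers cannot mutate the stats object by accident.
    changedAttributes: record[tagName].changedAttributes.slice(),
  }));
}
```

An empty or missing `changedTags` record simply yields an empty array.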

Utility Functions

stringToFlatTree(html)

Parse HTML string into structured tree representation.

function stringToFlatTree(html: string): {
  rootElements: Element[];
  allElements: Element[];
};

Time Complexity: O(n) where n is number of DOM nodes
Space Complexity: O(n)


validateHTML(html)

Validate HTML string and return parsing information.

function validateHTML(html: string): {
  isValid: boolean;
  errors: string[];
};

Algorithm Functions

computeLCS(a, b)

Compute Longest Common Subsequence using dynamic programming.

function computeLCS(a: string[], b: string[]): Array<[number, number]>;

Time Complexity: O(a × b), where a and b are the lengths of the two input arrays
Space Complexity: O(min(a, b))
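
For readers unfamiliar with the technique, the following is a minimal sketch of an LCS computation that returns matched index pairs, the same return shape as the signature above. The function name `lcsPairs` is hypothetical. For clarity it keeps the full dynamic-programming table, so it uses O(a × b) space rather than the space-optimized bound quoted above.

```typescript
// Returns [indexInA, indexInB] pairs for one longest common subsequence.
function lcsPairs(a: string[], b: string[]): Array<[number, number]> {
  const n = a.length;
  const m = b.length;

  // dp[i][j] = length of the LCS of the suffixes a[i..] and b[j..].
  const dp: number[][] = [];
  for (let i = 0; i <= n; i++) {
    dp.push(new Array<number>(m + 1).fill(0));
  }
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      dp[i][j] =
        a[i] === b[j] ? dp[i + 1][j + 1] + 1 : Math.max(dp[i + 1][j], dp[i][j + 1]);
    }
  }

  // Walk forward through the table, collecting matches along an optimal path.
  const pairs: Array<[number, number]> = [];
  let i = 0;
  let j = 0;
  while (i < n && j < m) {
    if (a[i] === b[j]) {
      pairs.push([i, j]);
      i++;
      j++;
    } else if (dp[i + 1][j] >= dp[i][j + 1]) {
      i++; // skipping a[i] loses nothing
    } else {
      j++; // skipping b[j] loses nothing
    }
  }
  return pairs;
}

console.log(lcsPairs(['a', 'b', 'c', 'd'], ['b', 'c', 'e', 'd']));
// → [ [ 1, 0 ], [ 2, 1 ], [ 3, 3 ] ]
```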


elementSimilarity(elementA, elementB, enableMemoization?)

Calculate similarity score between two DOM elements.

function elementSimilarity(
  elementA: Element,
  elementB: Element,
  enableMemoization?: boolean
): number;

Returns: Similarity score (higher = more similar)

Scoring Algorithm:

  • ID exact match: +10 points
  • Tag name match: +5 points
  • Class overlap: +N points (N = shared classes)
  • Attribute similarity: +0.5 × N points
  • Text content overlap: +0.3 × N points
  • Structure similarity: +1 point
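
The weights above can be illustrated with a small sketch that works on plain objects instead of DOM elements. Everything here is an assumption made for illustration: the `ElementLike` shape, the function names, and the reading of N as "shared classes", "attributes with identical values", and "shared whitespace-separated words". The structure term is left out because the README does not say how it is computed.

```typescript
// Hypothetical plain-object stand-in for a DOM element.
interface ElementLike {
  tag: string;
  id?: string;
  classes: string[];
  attrs: Record<string, string>;
  text: string;
}

// Count distinct items that appear in both lists.
function countShared(a: string[], b: string[]): number {
  const seen = new Set(a);
  let shared = 0;
  new Set(b).forEach((item) => {
    if (seen.has(item)) shared++;
  });
  return shared;
}

function similarityScore(a: ElementLike, b: ElementLike): number {
  let score = 0;
  // ID exact match: +10 (an empty or missing id never matches).
  if (a.id !== undefined && a.id !== '' && a.id === b.id) score += 10;
  // Tag name match: +5.
  if (a.tag.toLowerCase() === b.tag.toLowerCase()) score += 5;
  // Class overlap: +1 per shared class.
  score += countShared(a.classes, b.classes);
  // Attribute similarity: +0.5 per attribute with an identical value (assumed meaning of N).
  const sameAttrs = Object.keys(a.attrs).filter((name) => a.attrs[name] === b.attrs[name]).length;
  score += 0.5 * sameAttrs;
  // Text overlap: +0.3 per shared word (assumed meaning of N).
  const words = (s: string) => s.split(/\s+/).filter((w) => w.length > 0);
  score += 0.3 * countShared(words(a.text), words(b.text));
  return score;
}
```

With these weights a matching `id` outweighs all other signals for typical elements, which is consistent with the recommendation to give important elements stable `id` attributes.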

computeWordDiff(oldText, newText)

Perform word-level diff on text content.

function computeWordDiff(oldText: string, newText: string): Token[];

Returns: Array of tokens with change types (equal, added, removed)


Performance Functions

clearCaches()

Clear all memoization caches to free memory.

function clearCaches(): void;

getCacheStats()

Get cache performance statistics.

function getCacheStats(): {
  lcsCache: { size: number; hits: number; misses: number };
  similarityCache: { size: number; hits: number; misses: number };
};

getPerformanceMetrics()

Get detailed performance metrics.

function getPerformanceMetrics(): PerformanceMetrics;

⚙️ Configuration Options

ExtendedCompareOptions

Complete configuration interface combining style, tracking, and performance options.

interface ExtendedCompareOptions {
  // Style Configuration
  addedClass?: string; // Default: "diff-added"
  removedClass?: string; // Default: "diff-removed"
  elementChangeClass?: string; // Default: "diff-elem-changed"
  attributeChangeClass?: string; // Default: "diff-attr-changed"
  wrapperTag?: string; // Default: "span"
  textWrapperTag?: string; // Default: same as wrapperTag
  addedWrapperTag?: string; // Default: same as wrapperTag
  removedWrapperTag?: string; // Default: same as wrapperTag
  changedWrapperTag?: string; // Default: same as wrapperTag

  // Tracking Configuration
  watchedTags?: string[]; // Tags to track for special handling
  trackedTags?: string[] | Record<string, string[]>; // Tags and attributes to track
  trackedAttributes?: string[]; // Global attribute filter

  // Performance Configuration
  maxTextLength?: number; // Default: 10000
  minSimilarityThreshold?: number; // Default: 0
  enableMemoization?: boolean; // Default: true
  ignoreWhitespaceTexts?: boolean; // Default: false

  // Custom Handlers
  onElementChange?: ElementChangeHandler;
}
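
The default comments above imply a fallback chain: the per-change wrapper tags fall back to `wrapperTag`, which itself falls back to "span". A minimal sketch of that resolution follows, using a subset of the options. The names `DiffOptionsSubset` and `resolveOptions` are hypothetical; the library's own option handling may differ.

```typescript
// Subset of the documented options, restated so the sketch is self-contained.
interface DiffOptionsSubset {
  addedClass?: string;
  removedClass?: string;
  wrapperTag?: string;
  addedWrapperTag?: string;
  removedWrapperTag?: string;
  maxTextLength?: number;
  minSimilarityThreshold?: number;
  enableMemoization?: boolean;
}

// Fill in every missing option with the default documented in the interface comments.
function resolveOptions(options: DiffOptionsSubset = {}): Required<DiffOptionsSubset> {
  const wrapperTag = options.wrapperTag ?? 'span';
  return {
    addedClass: options.addedClass ?? 'diff-added',
    removedClass: options.removedClass ?? 'diff-removed',
    wrapperTag,
    // Per-change wrappers default to the general wrapper tag.
    addedWrapperTag: options.addedWrapperTag ?? wrapperTag,
    removedWrapperTag: options.removedWrapperTag ?? wrapperTag,
    maxTextLength: options.maxTextLength ?? 10000,
    minSimilarityThreshold: options.minSimilarityThreshold ?? 0,
    enableMemoization: options.enableMemoization ?? true,
  };
}
```

Using `??` rather than `||` keeps explicit values such as `minSimilarityThreshold: 0` or `enableMemoization: false` from being overwritten by the defaults.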

Common Configuration Examples

// Basic styling
const styleConfig = {
  addedClass: 'highlight-green',
  removedClass: 'highlight-red',
  wrapperTag: 'mark',
};

// Performance optimization
const performanceConfig = {
  minSimilarityThreshold: 0.3,
  maxTextLength: 5000,
  enableMemoization: true,
};

// Selective tracking
const trackingConfig = {
  watchedTags: ['img', 'a', 'button'],
  trackedAttributes: ['href', 'src', 'class', 'id'],
};

// Combined configuration
const fullConfig = {
  ...styleConfig,
  ...performanceConfig,
  ...trackingConfig,
};

🔄 Algorithm Flow & Implementation

System Overview

flowchart TD
    A[HTML Input] --> B[HTML Parsing]
    B --> C[Element Arrays]
    C --> D[Element Matching]
    D --> E[Recursive Comparison]
    E --> F[Text Diffing]
    F --> G[Statistics Collection]
    G --> H[Annotated DOM + Stats]

    subgraph "Element Matching Algorithm"
        D1[Similarity Matrix] --> D2[Best Match Selection]
        D2 --> D3[Pairing Results]
    end

    subgraph "Text Diffing Process"
        F1[Tokenization] --> F2[LCS Computation]
        F2 --> F3[Token Classification]
        F3 --> F4[DOM Annotation]
    end

    D --> D1
    F --> F1

Core Diff Algorithm Steps

  1. HTML Parsing: Parse input strings into DOM element trees using stringToFlatTree()
  2. Element Pool Creation: Create sets of old and new elements for matching
  3. Similarity Computation: Calculate similarity scores using multi-factor algorithm
  4. Element Pairing: Find optimal element matches using similarity thresholds
  5. Recursive Processing: For paired elements, recursively compare child nodes
  6. LCS Alignment: Align child nodes using Longest Common Subsequence algorithm
  7. Text Diffing: Perform word-level diff on text content with Unicode support
  8. DOM Annotation: Apply CSS classes and wrapper elements to indicate changes
  9. Statistics Collection: Gather comprehensive metrics about detected changes
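
Steps 3 and 4 can be sketched as a greedy best-match selection over a similarity matrix. This is an illustrative assumption: the README does not state whether pairing is greedy or globally optimal, and the names `Pairing` and `pairBySimilarity` are hypothetical.

```typescript
interface Pairing {
  pairs: Array<[number, number]>; // [oldIndex, newIndex]
  unmatchedOld: number[];
  unmatchedNew: number[];
}

function pairBySimilarity<T>(
  oldItems: T[],
  newItems: T[],
  score: (a: T, b: T) => number,
  minScore = 0
): Pairing {
  // Score every old/new combination and keep those above the threshold.
  const candidates: Array<{ i: number; j: number; s: number }> = [];
  oldItems.forEach((a, i) => {
    newItems.forEach((b, j) => {
      const s = score(a, b);
      if (s > minScore) candidates.push({ i, j, s });
    });
  });

  // Greedy selection: take the best-scoring pair whose members are both still free.
  candidates.sort((x, y) => y.s - x.s);
  const usedOld = new Set<number>();
  const usedNew = new Set<number>();
  const pairs: Array<[number, number]> = [];
  for (const c of candidates) {
    if (usedOld.has(c.i) || usedNew.has(c.j)) continue;
    usedOld.add(c.i);
    usedNew.add(c.j);
    pairs.push([c.i, c.j]);
  }

  const unused = (count: number, used: Set<number>): number[] => {
    const out: number[] = [];
    for (let k = 0; k < count; k++) if (!used.has(k)) out.push(k);
    return out;
  };
  return {
    pairs,
    unmatchedOld: unused(oldItems.length, usedOld), // candidates for "removed"
    unmatchedNew: unused(newItems.length, usedNew), // candidates for "added"
  };
}
```

Unmatched old items correspond to removals and unmatched new items to additions; paired items are what step 5 would then compare recursively.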

Element Similarity Algorithm

flowchart LR
    A[Element A] --> C[Similarity Calculator]
    B[Element B] --> C
    C --> D[ID Match: +10]
    C --> E[Tag Match: +5]
    C --> F[Class Overlap: +N]
    C --> G[Attribute Similarity: +0.5N]
    C --> H[Text Overlap: +0.3N]
    C --> I[Structure Score: +1]
    D --> J[Total Score]
    E --> J
    F --> J
    G --> J
    H --> J
    I --> J

Text Diffing Process

  1. Tokenization: Split text into words and punctuation using Unicode-aware regex
  2. LCS Computation: Find longest common subsequence of tokens
  3. Classification: Mark tokens as equal, added, or removed
  4. Merging: Combine consecutive tokens of same type
  5. DOM Generation: Create document fragments with appropriate wrapper elements
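
Steps 1 to 4 can be sketched as follows. This is a simplified illustration, not the library's code: it tokenizes on whitespace only (the library is described as using a Unicode-aware regex that also separates punctuation), it stops before DOM generation, and the name `wordDiff` is hypothetical.

```typescript
type TokenType = 'equal' | 'added' | 'removed';

interface Token {
  type: TokenType;
  text: string;
}

function wordDiff(oldText: string, newText: string): Token[] {
  // 1. Tokenization (simplified: whitespace-separated words).
  const split = (s: string) => s.split(/\s+/).filter((w) => w.length > 0);
  const a = split(oldText);
  const b = split(newText);

  // 2. LCS table: dp[i][j] = LCS length of a[i..] and b[j..].
  const dp: number[][] = [];
  for (let i = 0; i <= a.length; i++) dp.push(new Array<number>(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      dp[i][j] =
        a[i] === b[j] ? dp[i + 1][j + 1] + 1 : Math.max(dp[i + 1][j], dp[i][j + 1]);
    }
  }

  // 3 + 4. Classify each word and merge consecutive words of the same type.
  const out: Token[] = [];
  const push = (type: TokenType, word: string) => {
    const last = out[out.length - 1];
    if (last && last.type === type) last.text += ' ' + word;
    else out.push({ type, text: word });
  };
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      push('equal', a[i]);
      i++;
      j++;
    } else if (dp[i + 1][j] >= dp[i][j + 1]) {
      push('removed', a[i]);
      i++;
    } else {
      push('added', b[j]);
      j++;
    }
  }
  while (i < a.length) push('removed', a[i++]);
  while (j < b.length) push('added', b[j++]);
  return out;
}

console.log(wordDiff('The price is reasonable', 'The price is very reasonable'));
// → equal "The price is", added "very", equal "reasonable"
```

A DOM generation step would then wrap each non-equal token in the configured wrapper tag and CSS class.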

📊 Data Types

DiffStats

Comprehensive statistics about detected changes.

interface DiffStats {
  totalChangedTags: number; // Elements with tag/attribute changes
  totalAddedTexts: number; // Added text spans/nodes
  totalRemovedTexts: number; // Removed text spans/nodes
  totalAddedTags: number; // Newly added elements
  totalRemovedTags: number; // Removed elements
  totalAddedWords: number; // Total words added
  totalRemovedWords: number; // Total words removed

  addedTags?: Record<string, number>; // Per-tag addition counts
  removedTags?: Record<string, number>; // Per-tag removal counts
  changedTags?: Record<
    string,
    {
      // Per-tag change details
      count: number;
      changedAttributes: string[];
    }
  >;
}

DiffResultWithStats

Complete result including both DOM modifications and statistics.

interface DiffResultWithStats {
  diffResult: {
    oldRootElements: Element[]; // Root elements from old content
    newRootElements: Element[]; // Root elements from new content
    rootElements: Element[]; // All root elements (compatibility)
    allElements: Element[]; // All elements from both trees
  };
  stats: DiffStats;
}

Token

Individual unit in word-level diff.

interface Token {
  type: 'equal' | 'added' | 'removed';
  text: string;
}

Error Handling & Common Issues

HTML Parsing Errors

// Validate HTML before processing
const validation = validateHTML(htmlString);
if (!validation.isValid) {
  console.error('HTML validation failed:', validation.errors);
}

Performance Issues

// For large documents, adjust performance settings
const options = {
  maxTextLength: 1000, // Limit text diff size
  minSimilarityThreshold: 0.5, // Raise threshold for faster matching
  enableMemoization: true, // Enable caching
};

Memory Management

// Clear caches periodically for long-running applications
import { clearCaches } from '@rastaweb/domoscope';

clearCaches(); // Frees all memoization memory

Common Pitfalls

  1. Large Text Blocks: Word-level diffing becomes slow with very large text. Use maxTextLength option.
  2. Memory Leaks: Clear caches in long-running applications to prevent memory growth.
  3. Invalid HTML: Always validate HTML input, especially from user sources.
  4. Case Sensitivity: Element tag names are case-insensitive, but attribute values are case-sensitive.

🎯 Best Practices & Performance Tips

Optimal Configuration

// For content management systems
const cmsConfig = {
  watchedTags: ['img', 'a', 'video', 'iframe'],
  trackedAttributes: ['src', 'href', 'class'],
  minSimilarityThreshold: 0.3,
  maxTextLength: 5000,
};

// For code diff (low similarity tolerance)
const codeConfig = {
  minSimilarityThreshold: 0.8,
  enableMemoization: true,
  wrapperTag: 'mark',
};

// For large documents (performance focused)
const performanceConfig = {
  minSimilarityThreshold: 0.5,
  maxTextLength: 2000,
  enableMemoization: true,
  ignoreWhitespaceTexts: true,
};

Memory Optimization

// Monitor cache performance
const cacheStats = getCacheStats();
if (cacheStats.lcsCache.size > 1000) {
  clearCaches();
}

// Disable memoization for one-time operations
getCustomDiffStats(oldHTML, newHTML, { enableMemoization: false });

DOM Structure Recommendations

  • Use semantic HTML for better element matching
  • Include stable id attributes for important elements
  • Use consistent class naming for similar content types
  • Avoid deeply nested structures when possible

🔧 Runtime Behavior & Lifecycle

Initialization

// Library is stateless - no global initialization required
import { getCustomDiffStats } from '@rastaweb/domoscope';

// Each function call is independent
const result = getCustomDiffStats(html1, html2);

Memory Management

  • Caches: LCS and similarity computations are memoized by default
  • Cleanup: Caches auto-expire after 5 minutes
  • Size Limits: Caches are limited to 1000 entries each
  • Manual Control: Use clearCaches() for explicit cleanup
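
The cache behavior described above (time-based expiry, a fixed entry limit, hit and miss counters) can be sketched as below. This is an assumption-laden illustration, not the library's cache: the class name `TtlCache`, the oldest-first eviction policy, and the lazy expiry on read are all choices made for this sketch.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

class TtlCache<V> {
  hits = 0;
  misses = 0;
  private entries = new Map<string, CacheEntry<V>>();
  private maxEntries: number;
  private ttlMs: number;
  private now: () => number;

  // Defaults mirror the documented limits: 1000 entries, 5 minutes.
  constructor(maxEntries = 1000, ttlMs = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined || entry.expiresAt <= this.now()) {
      // Expired entries are dropped lazily, when they are next read.
      if (entry !== undefined) this.entries.delete(key);
      this.misses++;
      return undefined;
    }
    this.hits++;
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key); // re-inserting moves the key to the newest position
    if (this.entries.size >= this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }

  clear(): void {
    this.entries.clear();
  }
}
```

The injectable `now` function exists only to make expiry testable without waiting; in normal use the default clock applies.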

Concurrency Model

  • Synchronous: All operations are synchronous - no async/await needed
  • Thread Safe: No shared mutable state other than the memoization caches
  • Browser Compatible: Works in both Node.js and browser environments

🧪 Examples & Use Cases

Content Management System

// Track content changes in CMS
const { stats } = getCustomDiffStats(originalArticle, editedArticle, {
  watchedTags: ['img', 'a', 'blockquote'],
  trackedAttributes: ['src', 'href', 'alt'],
});

console.log(
  `Article edited: ${stats.totalAddedWords} words added, ${stats.totalRemovedWords} removed`
);

Version Control Interface

// Show file differences in version control UI
const { diffResult } = getCustomDiffStats(oldVersion, newVersion, {
  addedClass: 'git-added',
  removedClass: 'git-removed',
  wrapperTag: 'mark',
});

// Render diffResult.rootElements in UI

Automated Testing

// Assert content changes in tests
const { stats } = getCustomDiffStats(beforeHTML, afterHTML);
expect(stats.totalAddedTags).toBe(1);
expect(stats.addedTags?.button).toBe(1);

Email Template Comparison

// Compare email template versions
const { stats } = getCustomDiffStats(template1, template2, {
  watchedTags: ['img', 'a', 'table'],
  trackedAttributes: ['src', 'href', 'style', 'width', 'height'],
});

const report = formatTagStatsSummary(stats);

📈 Performance Characteristics

Algorithm Complexity

| Operation             | Time Complexity | Space Complexity | Notes                      |
| --------------------- | --------------- | ---------------- | -------------------------- |
| HTML Parsing          | O(n)            | O(n)             | n = DOM nodes              |
| Element Matching      | O(n×m×k)        | O(n+m)           | k = similarity computation |
| LCS Computation       | O(a×b)          | O(min(a,b))      | a,b = token arrays         |
| Text Tokenization     | O(t)            | O(tokens)        | t = text length            |
| Statistics Collection | O(elements)     | O(tags)          | Linear scan                |

Memory Usage

  • Base Library: ~50KB minified
  • Cache Memory: ~1MB max (auto-managed)
  • DOM Overhead: Proportional to input size
  • Peak Usage: 3-5x input HTML size during processing

Performance Benchmarks

  • Small Documents (<1KB): <1ms
  • Medium Documents (10KB): 10-50ms
  • Large Documents (100KB): 100-500ms
  • Very Large Documents (1MB+): Use performance settings

⚙️ Compatibility & Requirements

Environment Support

  • Node.js: ≥16.0.0
  • TypeScript: ≥4.5.0 (optional)
  • Browsers: Modern browsers with ES2022 support
  • Module Formats: ESM only (use type: "module")

Dependencies

  • Runtime: None (zero dependencies)
  • Peer Dependencies: TypeScript ≥4.5.0 (optional)
  • Dev Dependencies: Jest, TypeScript, ESLint, Prettier

Browser Compatibility

  • Chrome: ≥91
  • Firefox: ≥90
  • Safari: ≥14
  • Edge: ≥91

🧪 Testing & Examples

Running Tests

npm test              # Run test suite
npm run test:watch    # Watch mode
npm run test:coverage # Coverage report

Example Projects

  • examples/comprehensive-examples.mjs - Complete usage examples
  • playground/playground.html - Interactive browser demo
  • tests/simple.test.js - Basic functionality tests

Test Output Example

PASS tests/simple.test.js
✓ should perform basic diff operation (5ms)
✓ should generate statistics summary (3ms)
✓ should handle empty content (2ms)

Test Suites: 1 passed, 1 total
Tests: 3 passed, 3 total

📋 API Surface Summary

| Export                  | Type     | Description                   |
| ----------------------- | -------- | ----------------------------- |
| getCustomDiffStats      | Function | Main high-level diff function |
| compareElements         | Function | Element array comparison      |
| formatTagStatsSummary   | Function | Statistics formatting         |
| getChangedTagsList      | Function | Extract changed tag info      |
| stringToFlatTree        | Function | HTML parsing utility          |
| validateHTML            | Function | HTML validation               |
| computeLCS              | Function | LCS algorithm                 |
| elementSimilarity       | Function | Element similarity scoring    |
| computeWordDiff         | Function | Word-level text diff          |
| clearCaches             | Function | Memory management             |
| getCacheStats           | Function | Cache performance metrics     |
| DiffEngine              | Class    | Core diff engine              |
| StatsCollector          | Class    | Statistics collection         |
| ConfigBuilder           | Class    | Fluent configuration          |
| ConfigPresets           | Object   | Predefined configurations     |


📄 License

MIT License - see LICENSE file for details.

Repository: https://github.com/rastaweb/domoscope
Issues: https://github.com/rastaweb/domoscope/issues
Author: kamran taghinejad

  • LCS Algorithm: Optimized Longest Common Subsequence implementation with dynamic programming
  • Text-Level Diffing: Word-by-word and character-level comparison with tokenization
  • Element Similarity Scoring: Multi-factor scoring including tag names, attributes, and content

🎨 Configuration & Customization

  • Fluent Builder API: ConfigBuilder with method chaining for easy configuration
  • Configuration Presets: Pre-built configurations for common scenarios (CMS, forms, navigation, performance)
  • Flexible Tracking: Configurable tag and attribute tracking with wildcard support
  • Custom CSS Classes: Configurable styling for added, removed, and changed content
  • Wrapper Element Control: Customizable HTML wrapper tags for different change types
  • Element Change Handlers: Custom callbacks for handling specific element changes

📊 Advanced Statistics & Analytics

  • Comprehensive Change Metrics: Detailed statistics with per-tag breakdown
  • Performance Monitoring: Built-in timing and cache performance metrics
  • Accurate Change Counting: Precise statistics that count element changes once (not per DOM tree)
  • Changed Tags Analysis: Detailed tracking of which tags and attributes changed
  • Statistics Formatting: Human-readable summary formatting for debugging and reporting

⚡ Performance & Optimization

  • Memoization & Caching: Advanced caching with configurable TTL and size limits
  • Dynamic Programming: Space-optimized algorithms for large content comparison
  • Cache Management: Manual cache control with statistics and configuration
  • Performance Metrics: Detailed timing breakdown for pairing, LCS, and text diffing
  • Configurable Thresholds: Similarity thresholds and text length limits for optimization

🧩 Architecture & Engineering

  • Modular Architecture: SOLID principles with dependency inversion and clean interfaces
  • TypeScript First: Complete type safety with branded types and strict null checking
  • ES Modules: Modern module system with proper exports and imports
  • Universal Compatibility: Browser & Node.js support with ES modules and CommonJS
  • Extensible Design: Plugin-friendly architecture for custom extensions

🌍 Text & Internationalization

  • Unicode Support: Enhanced tokenization for international text and complex scripts
  • Multi-language Text Processing: Persian, Arabic, Chinese, and complex script handling
  • Smart Tokenization: Context-aware text splitting with punctuation and whitespace handling
  • HTML Validation: Built-in HTML parsing and validation utilities

🔧 Developer Experience

  • Interactive Playground: Built-in HTML playground for testing and experimentation
  • Algorithm Transparency: Detailed flow documentation with visual algorithm diagrams
  • Comprehensive API: Multiple levels of API from high-level to low-level utilities
  • Error Handling: Robust error handling with detailed error messages
  • Configuration Validation: Built-in validation for configuration options

🔬 Algorithm Flow Diagram

The core diff algorithm follows a sophisticated multi-stage process:

flowchart TD
    subgraph Input["🔄 Input Processing"]
        A1[HTML String 1] --> B1[Parse & Validate]
        A2[HTML String 2] --> B2[Parse & Validate]
        B1 --> C1[stringToFlatTree]
        B2 --> C2[stringToFlatTree]
        C1 --> D1[Element Arrays]
        C2 --> D2[Element Arrays]
    end

    subgraph Matching["🎯 Element Matching Phase"]
        D1 --> E[Element Pool Creation]
        D2 --> E
        E --> F[Similarity Matrix Computation]
        F --> G{elementSimilarity}
        G --> H[Best Match Selection]
        H --> I[Pairing Results]

        subgraph SimilarityAlgo["📏 Similarity Algorithm"]
            G1[ID Exact Match: +10]
            G2[Tag Name Match: +5]
            G3[Class Overlap: +N]
            G4[Attribute Similarity: +0.5*N]
            G5[Text Token Overlap: +0.3*N]
            G6[Structure Similarity: +1]
        end

        G --> SimilarityAlgo
    end

    subgraph Processing["⚙️ Diff Processing Phase"]
        I --> J[Paired Elements]
        I --> K[Unmatched Old]
        I --> L[Unmatched New]

        J --> M{compareNode}
        K --> N[Mark as Removed]
        L --> O[Mark as Added]

        subgraph NodeComparison["🔍 Node Comparison"]
            M1[Element Change Detection]
            M2[Child Node Alignment]
            M3[LCS Algorithm]
            M4[Text Content Diffing]
            M5[Recursive Processing]
        end

        M --> NodeComparison
    end

    subgraph LCS["📐 LCS Algorithm Detail"]
        P1[Build Node Keys]
        P2[Dynamic Programming Matrix]
        P3[Optimal Path Backtracking]
        P4[Match Sequence Generation]

        P1 --> P2
        P2 --> P3
        P3 --> P4
    end

    subgraph TextDiff["📝 Text Diffing Algorithm"]
        T1[tokenize Text]
        T2[LCS on Tokens]
        T3[Build Diff Tokens]
        T4[Merge Consecutive]
        T5[fragmentFromTokens]

        T1 --> T2
        T2 --> T3
        T3 --> T4
        T4 --> T5
    end

    subgraph Output["📊 Output Generation"]
        Q1[DOM with Annotations]
        Q2[Statistics Collection]
        Q3[Performance Metrics]
        Q4[Formatted Results]
    end

    NodeComparison --> LCS
    NodeComparison --> TextDiff
    N --> Q1
    O --> Q1
    NodeComparison --> Q1
    Q1 --> Q2
    Q2 --> Q3
    Q3 --> Q4

    style Input fill:#e1f5fe
    style Matching fill:#f3e5f5
    style Processing fill:#e8f5e8
    style LCS fill:#fff3e0
    style TextDiff fill:#fce4ec
    style Output fill:#f1f8e9

📦 Installation

# npm
npm install domoscope

# yarn
yarn add domoscope

# pnpm
pnpm add domoscope

# bun
bun add domoscope

Browser Usage

<!-- ES Modules (Recommended) -->
<script type="module">
  import { getCustomDiffStats } from './node_modules/domoscope/dist/index.js';
  window.domoscope = { getCustomDiffStats };
</script>

<!-- Legacy Browser Support -->
<script type="module" src="./node_modules/domoscope/browser-bundle.js"></script>

CDN Usage

<script type="module">
  import { getCustomDiffStats } from 'https://unpkg.com/domoscope/dist/index.js';
</script>

🚀 Quick Start

Basic Usage

import { getCustomDiffStats, formatTagStatsSummary } from 'domoscope';

const oldHTML = '<div><p>Original content</p></div>';
const newHTML = '<div><p>Modified content</p><img src="new.jpg" alt="New image"></div>';

// Generate diff with comprehensive statistics
const { diffResult, stats } = getCustomDiffStats(oldHTML, newHTML);

// Display the annotated results
document.body.appendChild(diffResult.rootElements[0]); // Old version with diff highlights
document.body.appendChild(diffResult.rootElements[1]); // New version with diff highlights

// Show detailed statistics
console.log(formatTagStatsSummary(stats));
// Output:
// ═════════════════════════════════════
//        DOMOSCOPE DIFF STATISTICS
// ═════════════════════════════════════
// Total Changed Tags: 2
// Total Elements: 4
// Performance: 1.23ms
// ═════════════════════════════════════

Advanced Configuration

import { ConfigBuilder, getCustomDiffStats, getPerformanceMetrics } from 'domoscope';

// Use fluent configuration API
const config = new ConfigBuilder()
  .watchTags('div', 'p', 'span')
  .trackAttributes('class', 'id', 'data-value')
  .withPerformance({
    minSimilarityThreshold: 0.7,
    enableMemoization: true,
    maxTextLength: 10000,
  })
  .build();

const result = getCustomDiffStats(oldHTML, newHTML, config);

// Access performance metrics
const metrics = getPerformanceMetrics();
console.log(`LCS computation: ${metrics.lcsTime}ms`);
console.log(`Element pairing: ${metrics.pairingTime}ms`);
console.log(`Cache efficiency: ${metrics.cacheHits}/${metrics.cacheMisses}`);

Preset Configurations

import { ConfigPresets, getCustomDiffStats } from 'domoscope';

// Basic configuration with minimal tracking
const basicResult = getCustomDiffStats(oldHTML, newHTML, ConfigPresets.basic());

// Content Management System optimized
const cmsResult = getCustomDiffStats(oldHTML, newHTML, ConfigPresets.cms());

// Form elements comparison
const formsResult = getCustomDiffStats(oldHTML, newHTML, ConfigPresets.forms());

// Navigation elements comparison
const navResult = getCustomDiffStats(oldHTML, newHTML, ConfigPresets.navigation());

// Performance-focused (minimal tracking)
const fastResult = getCustomDiffStats(oldHTML, newHTML, ConfigPresets.performance());

Configuration Presets

import { getCustomDiffStats, ConfigPresets } from 'domoscope';

// Use preset configurations for common scenarios
const cmsConfig = ConfigPresets.cms(); // Content management optimized
const formConfig = ConfigPresets.forms(); // Form diffing optimized
const navConfig = ConfigPresets.navigation(); // Navigation diffing
const perfConfig = ConfigPresets.performance(); // High performance

const { diffResult, stats } = getCustomDiffStats(oldHTML, newHTML, cmsConfig);

Custom Configuration

import { getCustomDiffStats, ConfigBuilder } from 'domoscope';

const customConfig = new ConfigBuilder()
  .withStyles({
    addedClass: 'my-added',
    removedClass: 'my-removed',
    elementChangeClass: 'my-changed',
  })
  .trackTags(['p', 'div', 'span'])
  .trackAttributes('class', 'id', 'data-value')
  .watchTags('img', 'video', 'iframe')
  .withPerformance({
    maxTextLength: 5000,
    enableMemoization: true,
  })
  .build();

const result = getCustomDiffStats(oldHTML, newHTML, customConfig);

📝 Comprehensive Examples

Example 1: Added and Removed Tags

import { getCustomDiffStats, formatTagStatsSummary } from 'domoscope';

const oldHTML = `
<div class="content">
  <h1>Article Title</h1>
  <p>Original paragraph content.</p>
  <ul>
    <li>Item 1</li>
    <li>Item 2</li>
  </ul>
</div>
`;

const newHTML = `
<div class="content">
  <h1>Article Title</h1>
  <p>Modified paragraph content with more details.</p>
  <blockquote>This is a new quote that was added.</blockquote>
  <ul>
    <li>Item 1</li>
    <li>Item 2</li>
    <li>Item 3</li>
  </ul>
  <img src="diagram.png" alt="New diagram" />
</div>
`;

// Generate diff with comprehensive tracking
const { diffResult, stats } = getCustomDiffStats(oldHTML, newHTML, {
  addedClass: 'highlight-added',
  removedClass: 'highlight-removed',
  watchedTags: ['blockquote', 'img', 'li'], // Watch for these tag additions/removals
});

// Display results
document.getElementById('old-version').appendChild(diffResult.rootElements[0]);
document.getElementById('new-version').appendChild(diffResult.rootElements[1]);

console.log(formatTagStatsSummary(stats));
// Output shows:
// - Added 1 blockquote element
// - Added 1 img element
// - Added 1 li element
// - Text changes in 1 p element

Example 2: Text and Word-Level Changes

import { getCustomDiffStats } from 'domoscope';

const oldHTML = `
<article>
  <h2>Product Review</h2>
  <p>This product is good and works well for basic needs.</p>
  <p>The price is reasonable at $50.</p>
</article>
`;

const newHTML = `
<article>
  <h2>Product Review</h2>
  <p>This product is excellent and works perfectly for advanced needs.</p>
  <p>The price is very reasonable at $45 with discount.</p>
</article>
`;

const { diffResult, stats } = getCustomDiffStats(oldHTML, newHTML, {
  addedClass: 'word-added',
  removedClass: 'word-removed',
  wrapperTag: 'mark', // Use <mark> tags for highlighting
});

// The result will show:
// - "good" → "excellent" (removed/added words)
// - "well" → "perfectly" (removed/added words)
// - "basic" → "advanced" (removed/added words)
// - "$50" → "$45 with discount" (removed/added words)

console.log(`Changed words: +${stats.totalAddedWords} -${stats.totalRemovedWords}`);
console.log(`Text spans changed: +${stats.totalAddedTexts} -${stats.totalRemovedTexts}`);

Example 3: Attribute Changes

import { getCustomDiffStats, getChangedTagsList } from 'domoscope';

const oldHTML = `
<div class="container">
  <img src="old-image.jpg" alt="Old description" width="300" />
  <a href="/old-link" title="Old title">Click here</a>
  <button type="button" disabled>Submit</button>
</div>
`;

const newHTML = `
<div class="container updated">
  <img src="new-image.jpg" alt="Updated description" width="400" height="300" />
  <a href="/new-link" title="Updated title" target="_blank">Click here</a>
  <button type="submit">Submit</button>
</div>
`;

const { diffResult, stats } = getCustomDiffStats(oldHTML, newHTML, {
  attributeChangeClass: 'attr-changed',
  elementChangeClass: 'element-modified',
  trackedTags: {
    img: ['src', 'alt', 'width', 'height'],
    a: ['href', 'title', 'target'],
    button: ['type', 'disabled'],
    div: ['class'],
  },
});

// Get detailed list of changes
const changes = getChangedTagsList(stats);
changes.forEach(({ tagName, count, changedAttributes }) => {
  console.log(`${tagName}: ${count} elements changed`);
  console.log(`  Attributes: ${changedAttributes.join(', ')}`);
});

// Expected output:
// div: 1 elements changed
//   Attributes: class
// img: 1 elements changed
//   Attributes: src, alt, width, height
// a: 1 elements changed
//   Attributes: href, title, target
// button: 1 elements changed
//   Attributes: type, disabled

Example 4: Complex Mixed Changes

import { getCustomDiffStats, ConfigBuilder } from 'domoscope';

const oldHTML = `
<section class="blog-post">
  <header>
    <h1>How to Use APIs</h1>
    <p class="meta">Published on 2024-01-15</p>
  </header>
  <main>
    <p>APIs are powerful tools for developers.</p>
    <code>fetch('/api/data')</code>
    <p>They allow seamless data exchange.</p>
  </main>
</section>
`;

const newHTML = `
<section class="blog-post featured">
  <header>
    <h1>How to Use REST APIs</h1>
    <p class="meta updated">Published on 2024-01-15, Updated on 2024-10-21</p>
    <div class="tags">
      <span class="tag">API</span>
      <span class="tag">Tutorial</span>
    </div>
  </header>
  <main>
    <p>REST APIs are powerful tools for modern developers.</p>
    <pre><code>fetch('/api/v2/data')</code></pre>
    <p>They allow seamless and efficient data exchange.</p>
    <p>Here's an example of error handling:</p>
    <code>try { ... } catch (error) { ... }</code>
  </main>
</section>
`;

const config = new ConfigBuilder()
  .withStyles({
    addedClass: 'diff-added',
    removedClass: 'diff-removed',
    elementChangeClass: 'diff-changed',
    attributeChangeClass: 'diff-attr-changed',
  })
  .trackTags(['section', 'h1', 'p', 'code', 'pre', 'div', 'span'])
  .trackAttributes('class')
  .watchTags('div', 'span', 'pre') // Watch for structural additions
  .build();

const { diffResult, stats } = getCustomDiffStats(oldHTML, newHTML, config);

// Detailed analysis
console.log('=== CHANGE SUMMARY ===');
console.log(`Total elements changed: ${stats.totalChangedTags}`);
console.log(`Elements added: ${stats.totalAddedTags}`);
console.log(`Words added: ${stats.totalAddedWords}`);
console.log(`Words removed: ${stats.totalRemovedWords}`);

// Per-tag breakdown
if (stats.addedTags) {
  console.log('\n=== ADDED ELEMENTS ===');
  Object.entries(stats.addedTags).forEach(([tag, count]) => {
    console.log(`+${count} ${tag} element(s)`);
  });
}

if (stats.changedTags) {
  console.log('\n=== CHANGED ELEMENTS ===');
  Object.entries(stats.changedTags).forEach(([tag, data]) => {
    console.log(`~${data.count} ${tag} element(s) modified`);
    if (data.changedAttributes.length > 0) {
      console.log(`  Attributes: ${data.changedAttributes.join(', ')}`);
    }
  });
}

// Expected output:
// === CHANGE SUMMARY ===
// Total elements changed: 4
// Elements added: 5
// Words added: 12
// Words removed: 4
//
// === ADDED ELEMENTS ===
// +1 div element(s)
// +2 span element(s)
// +1 pre element(s)
// +1 p element(s)
//
// === CHANGED ELEMENTS ===
// ~1 section element(s) modified
//   Attributes: class
// ~1 h1 element(s) modified
// ~1 p element(s) modified
//   Attributes: class
// ~1 code element(s) modified

Example 5: CSS Styling for Visual Diff

Add this CSS to visualize the changes:

/* Added content styling */
.diff-added {
  background-color: #d4edda;
  color: #155724;
  padding: 2px 4px;
  border-radius: 3px;
  border-left: 3px solid #28a745;
}

.highlight-added {
  background-color: #28a745;
  color: white;
  font-weight: bold;
  padding: 1px 3px;
  border-radius: 2px;
}

/* Removed content styling */
.diff-removed {
  background-color: #f8d7da;
  color: #721c24;
  padding: 2px 4px;
  border-radius: 3px;
  border-left: 3px solid #dc3545;
  text-decoration: line-through;
}

.highlight-removed {
  background-color: #dc3545;
  color: white;
  font-weight: bold;
  padding: 1px 3px;
  border-radius: 2px;
  text-decoration: line-through;
}

/* Changed elements styling */
.diff-changed {
  border: 2px dashed #ffc107;
  padding: 4px;
  border-radius: 4px;
  background-color: #fff3cd;
}

/* Attribute changes styling */
.diff-attr-changed {
  outline: 2px dotted #17a2b8;
  outline-offset: 2px;
  background-color: #d1ecf1;
}

/* Word-level changes */
.word-added {
  background-color: #90ee90;
  font-weight: bold;
}

.word-removed {
  background-color: #ffb6c1;
  text-decoration: line-through;
}

/* Element modifications */
.element-modified {
  box-shadow: 0 0 5px rgba(255, 193, 7, 0.5);
}

.attr-changed {
  border-bottom: 2px wavy #007bff;
}

Modular Imports

Domoscope supports modular imports for tree-shaking and reduced bundle size:

// Import only what you need
import { getCustomDiffStats } from 'domoscope';
import { ConfigBuilder } from 'domoscope/config';
import { computeLCS, elementSimilarity } from 'domoscope/algorithms';
import { stringToFlatTree, validateHTML } from 'domoscope/utils';
import { DiffEngine, StatsCollector } from 'domoscope/core';

// Or import specific types
import type { DiffStats, ExtendedCompareOptions } from 'domoscope/types';

Available Module Paths:

  • domoscope - Main entry point with all functionality
  • domoscope/config - Configuration builders and presets
  • domoscope/algorithms - Core algorithms and performance utilities
  • domoscope/utils - DOM manipulation and utility functions
  • domoscope/core - Core diff engine and statistics collector
  • domoscope/types - TypeScript type definitions

🎛️ API Reference

Core Functions

getCustomDiffStats(oldHTML, newHTML, options?)

High-level function that parses HTML, performs diffing, and collects statistics.

function getCustomDiffStats(
  oldHTML: string,
  newHTML: string,
  options?: ExtendedCompareOptions
): DiffResultWithStats;

Returns:

  • diffResult.rootElements: Array of root elements from both trees
  • diffResult.allElements: Array of all elements
  • stats: Comprehensive statistics object

compareElements(oldElements, newElements, options?)

Compare two arrays of DOM elements directly.

function compareElements(
  oldElements: Element[],
  newElements: Element[],
  options?: ExtendedCompareOptions
): void;

collectDiffStats(rootElements, options?)

Analyze diffed DOM elements and extract statistics.

function collectDiffStats(rootElements: Element[], options?: ExtendedCompareOptions): DiffStats;

formatTagStatsSummary(stats)

Create a formatted summary of diff statistics for debugging and reporting.

function formatTagStatsSummary(stats: DiffStats): string;

getChangedTagsList(stats)

Get a simple list of which tags were changed and what attributes changed.

function getChangedTagsList(stats: DiffStats): Array<{
  tagName: string;
  count: number;
  changedAttributes: string[];
}>;

Algorithm Functions

computeLCS(a, b, config?)

Compute Longest Common Subsequence with memoization.

function computeLCS(a: string[], b: string[], config?: LCSConfig): LCSMatch[];

elementSimilarity(a, b)

Calculate similarity score between two elements.

function elementSimilarity(a: Element, b: Element): SimilarityScore;

tokenize(text)

Tokenize text for word-level diffing with enhanced Unicode support.

function tokenize(text: string): Token[];

computeWordDiff(oldText, newText, maxLength?)

Compute word-level differences between two text strings.

function computeWordDiff(
  oldText: string,
  newText: string,
  maxLength?: number
): Array<{ type: 'equal' | 'added' | 'removed'; text: string }>;

Utility Functions

stringToFlatTree(html)

Parse HTML string into a flat tree structure.

function stringToFlatTree(html: string): ParsedTree;

validateHTML(html)

Validate HTML string and return parsing information.

function validateHTML(html: string): {
  isValid: boolean;
  errors: string[];
  warnings: string[];
};

nodeKey(node)

Generate a unique key for DOM node identification.

function nodeKey(node: Node): string;

wrapElement(element, className, wrapperTag?)

Wrap an element with a wrapper containing the specified class.

function wrapElement(element: Element, className: string | undefined, wrapperTag?: string): void;

Performance & Cache Management

clearCaches()

Clear all internal memoization caches.

function clearCaches(): void;

getCacheStats()

Get current cache performance statistics.

function getCacheStats(): {
  lcsCache: { size: number; hits: number; misses: number };
  similarityCache: { size: number; hits: number; misses: number };
};

getPerformanceMetrics()

Get detailed performance metrics from the last operations.

function getPerformanceMetrics(): PerformanceMetrics;

resetPerformanceMetrics()

Reset performance metrics counters.

function resetPerformanceMetrics(): void;

configureCaching(options)

Configure cache behavior and limits.

function configureCaching(options: { ttl?: number; maxSize?: number; enabled?: boolean }): void;

Core Classes

DiffEngine

Main diff engine for advanced usage.

class DiffEngine {
  constructor(options: ExtendedCompareOptions);
  compareElements(oldElements: Element[], newElements: Element[]): void;
}

StatsCollector

Statistics collection and analysis.

class StatsCollector {
  constructor(config: ExtendedCompareOptions);
  collectStats(rootElements: Element[]): DiffStats;
}

Configuration

ConfigBuilder

Fluent interface for building configurations:

const config = new ConfigBuilder()
  .withStyles({ addedClass: 'added', removedClass: 'removed' })
  .withTracking({ trackedTags: ['p', 'div'], trackedAttributes: ['class', 'id'] })
  .trackTags({ img: ['src', 'alt'], a: ['href'] })
  .trackAttributes('class', 'id')
  .watchTags('img', 'video')
  .withPerformance({ maxTextLength: 10000, enableMemoization: true })
  .withElementChangeHandler((oldEl, newEl, changeType, changedAttrs) => {
    // Custom element change handling
  })
  .build();

ConfigBuilder Methods:

  • withStyles(styleConfig): Set CSS classes and wrapper tags
  • withTracking(trackingConfig): Configure tag and attribute tracking
  • withPerformance(performanceConfig): Set performance optimization options
  • withElementChangeHandler(handler): Set custom element change handler
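  • trackTags(tags): Configure specific tags to track for changes; the examples pass either an array of tag names or an object mapping each tag to its tracked attributes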
  • trackAttributes(...attributes): Set attributes to track globally
  • watchTags(...tags): Configure tags to watch for additions/removals

ConfigPresets

Pre-built configurations for common use cases:

// Basic configuration with minimal tracking
const basicConfig = ConfigPresets.basic();

// Content management system optimized
const cmsConfig = ConfigPresets.cms();

// Form elements optimized
const formsConfig = ConfigPresets.forms();

// Navigation elements optimized
const navConfig = ConfigPresets.navigation();

// High-performance optimized
const perfConfig = ConfigPresets.performance();

Available Presets:

  • ConfigPresets.basic(): Minimal configuration with default settings
  • ConfigPresets.cms(): Optimized for content management (p, h1-h6, div, span tracking)
  • ConfigPresets.forms(): Optimized for form elements (input, select, textarea, button)
  • ConfigPresets.navigation(): Optimized for navigation (a, nav, ul, li elements)
  • ConfigPresets.performance(): High-performance with reduced processing

validateConfig(config)

Validate configuration options and get detailed error information.

function validateConfig(config: ExtendedCompareOptions): {
  isValid: boolean;
  errors: string[];
};

Advanced Usage

Custom Element Change Handler

const config = new ConfigBuilder()
  .withElementChangeHandler((oldEl, newEl, changeType, changedAttrs) => {
    if (changeType === 'attribute' && newEl?.tagName === 'IMG') {
      // Custom handling for image changes
      const wrapper = document.createElement('div');
      wrapper.className = 'image-change-indicator';

      if (changedAttrs?.includes('src')) {
        const badge = document.createElement('span');
        badge.textContent = 'Image Updated';
        wrapper.appendChild(badge);
      }

      return wrapper; // Custom wrapper element
    }

    return undefined; // Use default handling
  })
  .build();

Performance Monitoring

import { getCustomDiffStats, getPerformanceMetrics, resetPerformanceMetrics } from 'domoscope';

resetPerformanceMetrics();

// Perform diff operations...
getCustomDiffStats(oldHTML, newHTML);

const metrics = getPerformanceMetrics();
console.log(`Pairing time: ${metrics.pairingTime}ms`);
console.log(`LCS time: ${metrics.lcsTime}ms`);
console.log(`Cache hits: ${metrics.cacheHits}`);

🎨 CSS Styling

Add these CSS classes to style the diff results:

/* Added content */
.diff-added {
  background-color: #e6ffe6;
  color: #006600;
  text-decoration: none;
}

/* Removed content */
.diff-removed {
  background-color: #ffe6e6;
  color: #660000;
  text-decoration: line-through;
}

/* Changed elements */
.diff-elem-changed {
  border: 2px solid #ffa500;
  border-radius: 3px;
}

/* Changed attributes */
.diff-attr-changed {
  outline: 2px dotted #0066cc;
  outline-offset: 2px;
}

📊 Statistics Object

The DiffStats object provides comprehensive change metrics:

interface DiffStats {
  /** Number of elements with tag or attribute changes */
  totalChangedTags: number;

  /** Number of added text spans/nodes */
  totalAddedTexts: number;

  /** Number of removed text spans/nodes */
  totalRemovedTexts: number;

  /** Number of newly added elements */
  totalAddedTags: number;

  /** Number of removed elements */
  totalRemovedTags: number;

  /** Total number of words added across all text content */
  totalAddedWords: number;

  /** Total number of words removed across all text content */
  totalRemovedWords: number;

  /** Per-tag statistics for added elements (e.g., { a: 5, img: 2 }) */
  addedTags?: Record<string, number>;

  /** Per-tag statistics for removed elements (e.g., { a: 2, span: 10 }) */
  removedTags?: Record<string, number>;

  /** Per-tag statistics for changed elements with detailed attribute info */
  changedTags?: Record<
    string,
    {
      count: number;
      changedAttributes: string[];
    }
  >;
}

Usage Example:

const { stats } = getCustomDiffStats(oldHTML, newHTML);

console.log(`Total changes: ${stats.totalChangedTags}`);
console.log(`Added elements: ${stats.totalAddedTags}`);
console.log(`Removed elements: ${stats.totalRemovedTags}`);
console.log(`Added words: ${stats.totalAddedWords}`);
console.log(`Removed words: ${stats.totalRemovedWords}`);

// Per-tag breakdown
if (stats.addedTags) {
  Object.entries(stats.addedTags).forEach(([tag, count]) => {
    console.log(`Added ${count} ${tag} elements`);
  });
}

if (stats.changedTags) {
  Object.entries(stats.changedTags).forEach(([tag, data]) => {
    console.log(`Changed ${data.count} ${tag} elements:`);
    console.log(`  Attributes: ${data.changedAttributes.join(', ')}`);
  });
}

🏗️ Architecture

Domoscope follows SOLID principles with a clean, modular architecture:

src/
├── types/           # TypeScript type definitions
├── config/          # Configuration management
├── algorithms/      # Core algorithms with memoization
├── utils/           # DOM manipulation utilities
├── core/            # Main diff engine and statistics
└── index.ts         # Public API exports

Key Components

  • DiffEngine: Core comparison algorithm
  • StatsCollector: Statistics gathering and analysis
  • ConfigBuilder: Fluent configuration interface
  • Algorithm modules: LCS, similarity, and word diffing with optimization

📝 TypeScript Types

Domoscope exports comprehensive TypeScript types for full type safety:

Core Types

// Token types for text diffing
type TokenType = 'equal' | 'added' | 'removed';
type Token = { type: TokenType; text: string };

// Result types
interface DiffResult {
  rootElements: Element[];
  allElements: Element[];
}

interface DiffResultWithStats {
  diffResult: DiffResult;
  stats: DiffStats;
}

Configuration Types

// Style configuration
interface StyleConfig {
  addedClass?: string;
  removedClass?: string;
  elementChangeClass?: string;
  attributeChangeClass?: string;
  wrapperTag?: string;
  textWrapperTag?: string;
  addedWrapperTag?: string;
  removedWrapperTag?: string;
  changedWrapperTag?: string;
}

// Tracking configuration
interface TrackingConfig {
  watchedTags?: string[];
  trackedTags?: string[] | Record<string, string[]>;
  trackedAttributes?: string[];
}

// Performance configuration
interface PerformanceConfig {
  maxTextLength?: number;
  minSimilarityThreshold?: number;
  enableMemoization?: boolean;
  ignoreWhitespaceTexts?: boolean;
}

// Complete configuration
interface ExtendedCompareOptions extends StyleConfig, TrackingConfig, PerformanceConfig {
  onElementChange?: ElementChangeHandler;
}

Handler Types

type ElementChangeHandler = (
  oldEl: Element | null,
  newEl: Element | null,
  changeType: 'tag' | 'attribute' | 'tag-added' | 'tag-removed',
  changedAttrs?: string[]
) => void | Element | null;

Algorithm Types

// Internal algorithm types for advanced usage
interface LCSMatch {
  oldIndex: number;
  newIndex: number;
  length: number;
}

interface SimilarityScore {
  score: number;
  factors: {
    tagMatch: number;
    attributeMatch: number;
    contentMatch: number;
    structureMatch: number;
  };
}

interface PerformanceMetrics {
  pairingTime: number;
  lcsTime: number;
  textDiffTime: number;
  elementsProcessed: number;
  cacheHits: number;
  cacheMisses: number;
}

🔧 Configuration Options

Style Configuration

interface StyleConfig {
  addedClass?: string; // CSS class for added content
  removedClass?: string; // CSS class for removed content
  elementChangeClass?: string; // CSS class for changed elements
  attributeChangeClass?: string; // CSS class for attribute changes
  wrapperTag?: string; // HTML tag for wrappers
}

Tracking Configuration

interface TrackingConfig {
  watchedTags?: string[]; // Tags for special handling. Use ['*'] to watch all tags
  trackedTags?: string[] | Record<string, string[]>; // Tags to track
  trackedAttributes?: string[]; // Attributes to track
}

Performance Configuration

interface PerformanceConfig {
  maxTextLength?: number; // Max text length for word diffing
  minSimilarityThreshold?: number; // Min similarity for element pairing
  enableMemoization?: boolean; // Enable caching
}

📈 Performance

Domoscope is optimized for performance with several strategies:

  • Dynamic Programming: LCS algorithm with memoization
  • Intelligent Caching: Similarity scores and computation results
  • Efficient Algorithms: O(n*m) complexity with space optimization
  • Configurable Thresholds: Skip expensive operations when appropriate
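
To illustrate the caching strategy listed above, the following sketch memoizes a pairwise scoring function and counts hits and misses. It is a minimal sketch under stated assumptions, not Domoscope's internal cache: `memoizePairScore`, the string keys, and the counter shape are hypothetical names chosen for this example.

```typescript
// Hypothetical sketch: memoize a pair scorer whose inputs are identified by string keys.
type PairScorer = (aKey: string, bKey: string) => number;

interface CacheCounters {
  size: number;
  hits: number;
  misses: number;
}

function memoizePairScore(score: PairScorer): { scored: PairScorer; stats: () => CacheCounters } {
  const cache: Record<string, number> = {};
  const counters: CacheCounters = { size: 0, hits: 0, misses: 0 };

  const scored: PairScorer = (aKey, bKey) => {
    // JSON.stringify yields an unambiguous composite key for the ordered pair.
    const key = JSON.stringify([aKey, bKey]);
    if (Object.prototype.hasOwnProperty.call(cache, key)) {
      counters.hits++;
      return cache[key];
    }
    counters.misses++;
    counters.size++;
    cache[key] = score(aKey, bKey);
    return cache[key];
  };

  // Return a copy so callers cannot mutate the live counters.
  return { scored, stats: () => ({ ...counters }) };
}

// Usage: the second identical call is served from the cache.
const { scored, stats } = memoizePairScore((a, b) => (a === b ? 1 : 0));
scored('p.meta', 'p.meta');
scored('p.meta', 'p.meta');
console.log(stats()); // { size: 1, hits: 1, misses: 1 }
```

A cache like this only pays off when the same pair is scored repeatedly, which is why a switch such as enableMemoization is useful for one-off comparisons.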

Benchmarks

| Elements | Time (ms) | Memory (MB) |
| -------- | --------- | ----------- |
| 100      | ~5        | ~2          |
| 1,000    | ~45       | ~15         |
| 10,000   | ~450      | ~120        |

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

  • Inspired by modern diff algorithms and DOM manipulation techniques
  • Built with TypeScript for a better developer experience
  • Optimized using dynamic programming patterns

Domoscope - Advanced HTML diffing for the modern web. 🔍

Public API Functions

stringToFlatTree(html: string)

export function stringToFlatTree(html: string): {
  rootElements: Element[];
  allElements: Element[];
};

Purpose: Parses HTML string into a structured DOM representation for processing.

Algorithm:

  1. Creates temporary container element
  2. Sets innerHTML to parse the HTML
  3. Recursively traverses all descendants to build flat element list
  4. Returns both root-level elements and complete element inventory

Usage Example:

const { rootElements, allElements } = stringToFlatTree('<div><p>Hello</p></div>');
console.log(rootElements.length); // 1 (the div)
console.log(allElements.length); // 2 (div + p)

Performance Notes: Uses native browser HTML parsing for optimal speed. The flat traversal enables efficient similarity comparisons later.

flowchart TD
    A["HTML String"] --> B["Create temp container"]
    B --> C["Set innerHTML"]
    C --> D["Extract root elements"]
    D --> E["Recursive traverse"]
    E --> F["Build allElements array"]
    F --> G["Return rootElements + allElements"]
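
The traversal step of stringToFlatTree can be sketched without a browser by standing in a plain object for each DOM element. This is only an illustration of the depth-first flattening described above: `TreeNode` and `flattenTree` are hypothetical names, and the real function works on `Element` instances produced by innerHTML parsing.

```typescript
// Minimal stand-in for a DOM element so the sketch runs outside a browser.
interface TreeNode {
  tag: string;
  children: TreeNode[];
}

// Depth-first walk that records every node in document order.
function flattenTree(roots: TreeNode[]): { rootElements: TreeNode[]; allElements: TreeNode[] } {
  const allElements: TreeNode[] = [];
  const visit = (node: TreeNode): void => {
    allElements.push(node);
    for (const child of node.children) visit(child);
  };
  for (const root of roots) visit(root);
  return { rootElements: roots, allElements };
}

// Mirrors '<div><p>Hello</p></div>': one root, two elements in total.
const tree: TreeNode[] = [{ tag: 'div', children: [{ tag: 'p', children: [] }] }];
const flat = flattenTree(tree);
console.log(flat.rootElements.length); // 1
console.log(flat.allElements.map((n) => n.tag).join(',')); // div,p
```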

📚 Detailed Algorithm Documentation

Core Algorithm Flow

sequenceDiagram
    participant Input as HTML Input
    participant Parser as HTML Parser
    participant Matcher as Element Matcher
    participant LCS as LCS Engine
    participant Differ as Text Differ
    participant Output as Annotated DOM

    Input->>Parser: Parse HTML strings
    Parser->>Parser: Validate & sanitize
    Parser->>Matcher: Element arrays

    Matcher->>Matcher: Compute similarity matrix
    Note over Matcher: O(n×m×k) complexity
    Matcher->>Matcher: Find optimal pairings

    Matcher->>LCS: Aligned element pairs
    LCS->>LCS: Child node alignment
    Note over LCS: Dynamic programming O(a×b)
    LCS->>Differ: Text content pairs

    Differ->>Differ: Tokenize & compute word diff
    Note over Differ: Enhanced Unicode tokenization
    Differ->>Output: Annotated fragments

    LCS->>Output: Structure with diff markers
    Matcher->>Output: Element change annotations
    Output->>Output: Collect statistics

1. Element Similarity Algorithm

The core matching algorithm uses a multi-factor scoring system:

function elementSimilarity(a: Element, b: Element): number {
  let score = 0;

  // 🎯 ID exact match (highest priority)
  if (a.id && b.id && a.id === b.id) {
    score += 10; // Strong identity signal
  }

  // 🏷️ Tag name compatibility
  if (a.tagName === b.tagName) {
    score += 5; // Structural similarity
  }

  // 🎨 Class overlap analysis
  const classIntersection = getClassIntersection(a, b);
  score += classIntersection.length; // +1 per shared class

  // 📋 Attribute similarity
  const attrSimilarity = computeAttributeSimilarity(a, b);
  score += attrSimilarity * 0.5; // Weighted attribute score

  // 📝 Text content analysis
  const textSimilarity = computeTextSimilarity(a.textContent, b.textContent);
  score += textSimilarity * 0.3; // Content relevance

  // 🏗️ Structural compatibility
  const structSimilarity = computeStructuralSimilarity(a, b);
  score += structSimilarity; // Child count & nesting

  return score;
}

Similarity Scoring Breakdown

| Factor | Weight | Description | Example Impact |
| --- | --- | --- | --- |
| ID Match | 10.0 | Exact ID equality | <div id="header"> matches strongly |
| Tag Match | 5.0 | Same HTML tag | <p> prefers <p> over <div> |
| Class Overlap | 1.0 per class | Shared CSS classes | .nav.active vs .nav.hidden = 1.0 |
| Attribute Similarity | 0.5 × count | Similar attributes | data-*, aria-* attributes |
| Text Similarity | 0.3 × tokens | Shared text tokens | Common words/phrases |
| Structure Match | 0.5-1.0 | Child count similarity | Similar nesting patterns |
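
As a worked example of the weights above, the sketch below applies only the ID, tag, and class-overlap factors to plain descriptors. `ElementDescriptor` and `partialSimilarity` are hypothetical names for this illustration; the attribute, text, and structure factors are left out, so the numbers are lower bounds on what the full scoring would give.

```typescript
// Plain descriptor standing in for a DOM element (assumption for this sketch).
interface ElementDescriptor {
  tag: string;
  id?: string;
  classes: string[];
}

// Applies only the ID (+10), tag (+5) and class-overlap (+1 per shared class) weights.
function partialSimilarity(a: ElementDescriptor, b: ElementDescriptor): number {
  let score = 0;
  if (a.id && b.id && a.id === b.id) score += 10;
  if (a.tag === b.tag) score += 5;
  score += a.classes.filter((c) => b.classes.indexOf(c) !== -1).length;
  return score;
}

const active: ElementDescriptor = { tag: 'div', classes: ['nav', 'active'] };
const hidden: ElementDescriptor = { tag: 'div', classes: ['nav', 'hidden'] };
const header: ElementDescriptor = { tag: 'div', id: 'header', classes: [] };

console.log(partialSimilarity(active, hidden)); // 6  (tag 5 + shared class "nav" 1)
console.log(partialSimilarity(header, header)); // 15 (id 10 + tag 5)
console.log(partialSimilarity(active, { tag: 'p', classes: [] })); // 0
```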

2. LCS (Longest Common Subsequence) Engine

Algorithm Selection Strategy

flowchart LR
    A["Input Arrays"] --> B{"Size Check"}
    B -->|"Small Arrays n,m < 1000"| C["Standard DP O(n×m) space"]
    B -->|"Large Arrays n,m ≥ 1000"| D["Space-Optimized O(min(n,m)) space"]

    C --> E["Memoization Check"]
    D --> F["Direct Computation"]

    E -->|"Cache Hit"| G["Return Cached"]
    E -->|"Cache Miss"| H["Compute & Cache"]

    G --> I["LCS Matches"]
    H --> I
    F --> I

Standard Dynamic Programming Approach

function computeLCS(a: string[], b: string[]): LCSMatch[] {
  const n = a.length,
    m = b.length;
  const dp = Array.from({ length: n + 1 }, () => Array(m + 1).fill(0));

  // Fill DP table (bottom-up)
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      if (a[i] === b[j]) {
        dp[i][j] = 1 + dp[i + 1][j + 1]; // Match found
      } else {
        dp[i][j] = Math.max(dp[i + 1][j], dp[i][j + 1]); // Take best
      }
    }
  }

  // Backtrack to find actual matches
  return backtrackMatches(dp, a, b);
}
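
The backtrackMatches helper is not shown in this README. One possible shape is sketched below, assuming that an `LCSMatch` describes a run of consecutive matches starting at `oldIndex` and `newIndex`. Both `buildTable` and `backtrackMatchesSketch` are hypothetical names, and the actual helper may group or order matches differently.

```typescript
interface LCSMatch {
  oldIndex: number;
  newIndex: number;
  length: number;
}

// Builds the suffix-based table: dp[i][j] = LCS length of a[i..] and b[j..].
function buildTable(a: string[], b: string[]): number[][] {
  const dp: number[][] = [];
  for (let i = 0; i <= a.length; i++) {
    const row: number[] = [];
    for (let j = 0; j <= b.length; j++) row.push(0);
    dp.push(row);
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      dp[i][j] = a[i] === b[j] ? 1 + dp[i + 1][j + 1] : Math.max(dp[i + 1][j], dp[i][j + 1]);
    }
  }
  return dp;
}

// Hypothetical backtracking step: walk forward from (0, 0), grouping adjacent matches into runs.
function backtrackMatchesSketch(dp: number[][], a: string[], b: string[]): LCSMatch[] {
  const matches: LCSMatch[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      const last = matches[matches.length - 1];
      if (last && last.oldIndex + last.length === i && last.newIndex + last.length === j) {
        last.length++; // extends the current run
      } else {
        matches.push({ oldIndex: i, newIndex: j, length: 1 });
      }
      i++;
      j++;
    } else if (dp[i + 1][j] >= dp[i][j + 1]) {
      i++; // skipping a[i] keeps the best achievable LCS length
    } else {
      j++; // skipping b[j] is strictly better here
    }
  }
  return matches;
}

const oldKeys = ['a', 'b', 'c', 'd'];
const newKeys = ['a', 'x', 'c', 'd'];
console.log(JSON.stringify(backtrackMatchesSketch(buildTable(oldKeys, newKeys), oldKeys, newKeys)));
// [{"oldIndex":0,"newIndex":0,"length":1},{"oldIndex":2,"newIndex":2,"length":2}]
```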

Space-Optimized Version

For large inputs, the engine switches to a two-row table that needs only O(min(n, m)) space:

function computeLCSSpaceOptimized(a: string[], b: string[]): number {
  // Ensure 'a' is the shorter array so each row has min(n, m) + 1 cells
  if (a.length > b.length) {
    return computeLCSSpaceOptimized(b, a); // LCS length is symmetric
  }

  let prev = Array(a.length + 1).fill(0);
  let curr = Array(a.length + 1).fill(0);

  // Process row by row, keeping only the current and previous rows
  for (let j = b.length - 1; j >= 0; j--) {
    for (let i = a.length - 1; i >= 0; i--) {
      if (a[i] === b[j]) {
        curr[i] = 1 + prev[i + 1];
      } else {
        curr[i] = Math.max(prev[i], curr[i + 1]);
      }
    }
    [prev, curr] = [curr, prev]; // Swap rows
  }

  // After the final swap, prev[0] holds the LCS length. Two rows are not enough
  // to backtrack, so recovering the LCSMatch[] list needs an extra step
  // (for example Hirschberg's divide-and-conquer scheme), which is not shown here.
  return prev[0];
}

3. Text Diffing Algorithm

Enhanced Tokenization

Supports complex Unicode and international text:

function tokenize(text: string): string[] {
  // Unicode-aware tokenization with category support
  return text.match(/\p{L}+\p{M}*|\d+|[^\s\p{L}\p{N}]+/gu) || [];
}
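
A few sample inputs make the behavior of this pattern concrete. The sketch below reuses the same regular expression, built with the RegExp constructor so that it also type-checks under older compile targets; `tokenizeSketch` is a hypothetical name, and the library's own tokenize may return richer Token objects rather than strings.

```typescript
// Same pattern as the tokenizer above, expressed as a string for the RegExp constructor.
const TOKEN_PATTERN = new RegExp('\\p{L}+\\p{M}*|\\d+|[^\\s\\p{L}\\p{N}]+', 'gu');

function tokenizeSketch(text: string): string[] {
  // String.prototype.match with a global regex returns all matches, or null if none.
  return text.match(TOKEN_PATTERN) || [];
}

console.log(tokenizeSketch('Hello, world 42!').join(' | ')); // Hello | , | world | 42 | !
console.log(tokenizeSketch('Привет мир').join(' | '));       // Привет | мир
console.log(tokenizeSketch('$45').join(' | '));              // $ | 45
console.log(tokenizeSketch('   ').length);                   // 0
```

Note that punctuation and digits form separate tokens, so a change from "$50" to "$45" is seen as a change to the number token only.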

Word-Level Diff Generation

flowchart LR
    A[Old Text] --> B[tokenize]
    C[New Text] --> D[tokenize]
    B --> E[Token Arrays]
    D --> E
    E --> F[LCS on Tokens]
    F --> G[Build Diff Sequence]
    G --> H[Merge Consecutive]
    H --> I[Fragment Generation]

    subgraph "Token Types"
        T1[equal: unchanged]
        T2[added: new content]
        T3[removed: deleted content]
    end
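
The pipeline in the flowchart above can be sketched end to end in a few lines. This is a simplified illustration, not the library's computeWordDiff: `wordDiffSketch` and `DiffToken` are hypothetical names, tokens are split on whitespace rather than with the Unicode-aware tokenizer, and the merge step is left out.

```typescript
type DiffToken = { type: 'equal' | 'added' | 'removed'; text: string };

// Simplified word diff: whitespace tokens, an LCS table, then a forward walk.
function wordDiffSketch(oldText: string, newText: string): DiffToken[] {
  const a = oldText.split(/\s+/).filter((w) => w.length > 0);
  const b = newText.split(/\s+/).filter((w) => w.length > 0);

  // dp[r][c] = LCS length of a[r..] and b[c..]
  const dp: number[][] = [];
  for (let r = 0; r <= a.length; r++) {
    const row: number[] = [];
    for (let c = 0; c <= b.length; c++) row.push(0);
    dp.push(row);
  }
  for (let r = a.length - 1; r >= 0; r--) {
    for (let c = b.length - 1; c >= 0; c--) {
      dp[r][c] = a[r] === b[c] ? 1 + dp[r + 1][c + 1] : Math.max(dp[r + 1][c], dp[r][c + 1]);
    }
  }

  // Walk both token arrays, emitting equal, removed, or added tokens.
  const out: DiffToken[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push({ type: 'equal', text: a[i] });
      i++;
      j++;
    } else if (dp[i + 1][j] >= dp[i][j + 1]) {
      out.push({ type: 'removed', text: a[i++] });
    } else {
      out.push({ type: 'added', text: b[j++] });
    }
  }
  while (i < a.length) out.push({ type: 'removed', text: a[i++] });
  while (j < b.length) out.push({ type: 'added', text: b[j++] });
  return out;
}

const tokens = wordDiffSketch('The price is $50', 'The price is $45');
console.log(tokens.map((t) => `${t.type}:${t.text}`).join(' '));
// equal:The equal:price equal:is removed:$50 added:$45
```

Merging consecutive tokens of the same type would then collapse the three equal tokens into a single "The price is" fragment.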

Consecutive Token Merging

function mergeConsecutiveTokens(tokens: Token[]): Token[] {
  const merged: Token[] = [];
  let current: Token | null = null;

  for (const token of tokens) {
    if (current && current.type === token.type) {
      // Merge with previous token of same type
      current.text += ' ' + token.text;
    } else {
      if (current) merged.push(current);
      current = { ...token };
    }
  }

  if (current) merged.push(current);
  return merged;
}

compareElements(oldEls: Element[], newEls: Element[], options: CompareOptions)

Purpose: The core diff engine that compares two element arrays and applies visual change indicators.

Algorithm Overview:

flowchart TD
    A[Old Elements] --> B[Similarity Matching]
    C[New Elements] --> B
    B --> D[Paired Elements]
    B --> E[Unmatched Old]
    B --> F[Unmatched New]

    D --> G[compareNode recursion]
    E --> H[Mark as removed]
    F --> I[Mark as added]

    G --> J[DOM with diff annotations]
    H --> J
    I --> J

Detailed Steps:

  1. Similarity-Based Pairing:

    • Uses elementSimilarity() to score potential matches
    • Prefers same-tag matches but allows cross-tag pairing for high similarity
    • Maintains a pool of unmatched elements
  2. Special Handling for Watched Tags:

    • Elements in watchedTags get wrapped when added/removed
    • Use '*' wildcard to watch all HTML tags: watchedTags: ['*']
    • The wildcard overrides specific tags: watchedTags: ['*', 'div'] still watches everything, so the extra entry is redundant
    • Triggers onElementChange callback for custom handling
  3. Recursive Processing:

    • Paired elements go through compareNode() for deep comparison
    • Unmatched elements get marked as added/removed with appropriate CSS classes

Usage Example:

const oldTree = stringToFlatTree('<div><p>Old text</p></div>');
const newTree = stringToFlatTree('<div><p>New text</p></div>');

compareElements(oldTree.rootElements, newTree.rootElements, {
  addedClass: 'highlight-added',
  removedClass: 'highlight-removed',
  watchedTags: ['img', 'a'], // Watch specific tags
  // watchedTags: ['*'],          // Watch ALL tags (wildcard)
  // watchedTags: ['*', 'div'],   // Watch all tags (redundant example)
  onElementChange: (oldEl, newEl, changeType) => {
    console.log(`${changeType} detected`);
    return null; // use default wrapping
  },
});

collectDiffStats(rootElements: Element[], options: CompareOptions)

Purpose: Analyzes a diffed DOM tree to extract comprehensive change statistics.

Algorithm:

flowchart TD
    A[Diffed DOM Elements] --> B[Recursive Traversal]
    B --> C[Check CSS Classes]
    C --> D[Count Text Changes]
    C --> E[Count Element Changes]
    C --> F[Read data-* attributes]

    F --> G[Extract changed attributes]
    F --> H[Extract tag types]

    D --> I[Update totalAddedTexts/Removed]
    E --> J[Update totalChangedTags]
    G --> K[Update changedTags]
    H --> L[Update addedTags/removedTags]

    I --> M[DiffStats Object]
    J --> M
    K --> M
    L --> M

Statistical Categories:

  • Text-level: Counts wrapped text spans indicating additions/removals
  • Element-level: Counts structural changes (new/removed tags)
  • Attribute-level: Tracks which attributes changed on which tag types
  • Per-tag breakdown: Aggregates all changes by HTML tag type
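
The per-tag aggregation can be sketched independently of the DOM. The example below assumes the traversal has already produced a flat list of change records; `ChangeRecord`, `TagStatsSketch`, and `aggregateChanges` are hypothetical names, and the real collector reads CSS classes and data-* attributes from the annotated elements instead.

```typescript
type ChangeRecord =
  | { kind: 'added' | 'removed'; tag: string }
  | { kind: 'changed'; tag: string; attributes: string[] };

interface TagStatsSketch {
  totalChangedTags: number;
  totalAddedTags: number;
  totalRemovedTags: number;
  addedTags: Record<string, number>;
  removedTags: Record<string, number>;
  changedTags: Record<string, { count: number; changedAttributes: string[] }>;
}

function aggregateChanges(records: ChangeRecord[]): TagStatsSketch {
  const stats: TagStatsSketch = {
    totalChangedTags: 0,
    totalAddedTags: 0,
    totalRemovedTags: 0,
    addedTags: {},
    removedTags: {},
    changedTags: {},
  };

  for (const record of records) {
    if (record.kind === 'changed') {
      stats.totalChangedTags++;
      const entry = stats.changedTags[record.tag] || { count: 0, changedAttributes: [] };
      entry.count++;
      // Keep each attribute name once per tag type.
      for (const attr of record.attributes) {
        if (entry.changedAttributes.indexOf(attr) === -1) entry.changedAttributes.push(attr);
      }
      stats.changedTags[record.tag] = entry;
    } else if (record.kind === 'added') {
      stats.totalAddedTags++;
      stats.addedTags[record.tag] = (stats.addedTags[record.tag] || 0) + 1;
    } else {
      stats.totalRemovedTags++;
      stats.removedTags[record.tag] = (stats.removedTags[record.tag] || 0) + 1;
    }
  }
  return stats;
}

const summary = aggregateChanges([
  { kind: 'added', tag: 'img' },
  { kind: 'added', tag: 'img' },
  { kind: 'removed', tag: 'span' },
  { kind: 'changed', tag: 'a', attributes: ['href'] },
  { kind: 'changed', tag: 'a', attributes: ['href', 'class'] },
]);
console.log(JSON.stringify(summary.addedTags)); // {"img":2}
console.log(JSON.stringify(summary.changedTags)); // {"a":{"count":2,"changedAttributes":["href","class"]}}
```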

Usage Example:

// After running compareElements...
const stats = collectDiffStats(diffedElements, options);

console.log(stats);
// Output:
// {
//   totalChangedTags: 3,
//   totalAddedTexts: 5,
//   totalRemovedTexts: 2,
//   addedTags: { img: 2, p: 1 },
//   removedTags: { span: 3 },
//   changedTags: {
//     a: { count: 2, changedAttributes: ['href', 'class'] }
//   }
// }

getCustomDiffStats(oldHTML: string, newHTML: string, options: CompareOptions)

Purpose: High-level convenience function that combines parsing, diffing, and statistics collection.

Workflow:

flowchart LR
    A[Old HTML] --> B[stringToFlatTree]
    C[New HTML] --> D[stringToFlatTree]
    B --> E[compareElements]
    D --> E
    E --> F[collectDiffStats]
    F --> G["diffResult + stats"]

Return Value:

{
  diffResult: {
    rootElements: Element[],    // All root elements from both trees
    allElements: Element[]      // All elements from both trees
  },
  stats: DiffStats              // Comprehensive statistics
}

Usage Example:

const oldHTML = '<div><p>Original content</p></div>';
const newHTML = '<div><p>Modified content</p><img src="new.jpg"></div>';

const { diffResult, stats } = getCustomDiffStats(oldHTML, newHTML, {
  trackedTags: { img: ['src'], p: ['class'] },
  trackedAttributes: ['src', 'class', 'href'],
});

// DOM is now annotated with diff classes
document.body.appendChild(diffResult.rootElements[0]); // old version
document.body.appendChild(diffResult.rootElements[1]); // new version

// Stats show exactly what changed
console.log(`Added ${stats.addedTags?.img || 0} images`);

formatTagStatsSummary(stats: DiffStats)

Purpose: Creates human-readable summary of per-tag statistics for debugging and reporting.

Output Format:

=== PER-TAG DIFF STATISTICS ===

🟢 Added Tags:
  - <img>: 2 element(s)
  - <p>: 1 element(s)

🔴 Removed Tags:
  - <span>: 3 element(s)

🟡 Changed Tags:
  - <a>: 2 element(s)
    Changed attributes: href, class
  - <img>: 1 element(s)
    Changed attributes: src

📊 Totals: 3 added, 3 removed, 3 changed
📝 Text changes: 5 added, 2 removed

Internal Algorithm Functions

compareNode(oldEl: Element, newEl: Element, options: CompareOptions)

Purpose: Recursively compares two matched DOM elements and their children.

Algorithm Steps:

  1. Element-level Change Detection: Calls detectAndWrapElementChange() first
  2. Child Alignment: Uses LCS algorithm to align child nodes optimally
  3. Recursive Processing: Processes matched pairs recursively
  4. Text Diffing: For text nodes, performs word-level diffing

**LCS