@lotor/cli

v0.0.1

Code duplicate detection tool

Code Duplicate Detection Tool

A scalable structural code duplicate detector using a memory-efficient Two-Phase Analysis approach.

📋 Overview

A CLI tool that parses JavaScript/TypeScript files with the OXC parser, extracts metadata without keeping ASTs in memory, and identifies duplicates by matching patterns in that metadata.

Key Goal: Handle large-scale projects without Out of Memory (OOM) issues by separating metadata collection from similarity analysis.

🚀 Quick Start

Prerequisites

  • Node.js 20.x >= 20.19.0, or Node.js >= 22.12.0
  • pnpm >= 9.0.0

Installation

# Install dependencies
pnpm install

# Build the project
pnpm build

Usage

# Run the CLI
node dist/cli/bin.js 'src/**/*.ts'

# Or use the binary directly (after npm link)
dedupe 'src/**/*.ts'

📁 Project Structure

src/
├── core/           # Core duplicate detection engine
│   ├── context.ts  # GlobalContext implementation
│   └── runner.ts   # Two-Phase analysis orchestrator
├── detectors/      # Built-in detectors
│   └── magic-number/  # Detects duplicate literals
├── reporters/      # Output formatters
├── types/          # Type definitions
├── cli/            # Command-line interface
│   ├── bin.ts      # CLI entry point
│   └── helpers/    # File collection utilities
├── languages/      # Language-specific detectors (to be implemented)
│   ├── js/
│   ├── ts/
│   ├── vue/
│   └── svelte/
└── index.ts        # Main library exports

🔧 Development

Build

# Build the project
pnpm build

# Watch mode
pnpm dev

# Clean build artifacts
pnpm clean

Format

pnpm format

Test

pnpm test
pnpm test:ui
pnpm test:coverage

🏗️ Architecture

Two-Phase Analysis

The analysis is split into two phases so that parsed ASTs never accumulate in memory:

Phase 1: Collection (Memory-Efficient)

  1. Parse each file using the OXC parser
  2. Each detector's Collector visits the AST nodes
  3. Extract only essential metadata and store it in GlobalContext, in namespaced Maps
  4. Discard the AST immediately; ASTs are never kept in memory
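
The collection steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `AstNode`, `LiteralRecord`, `collectFile`, the `"magic-number"` namespace string, and the toy `fakeParse` function are all hypothetical stand-ins (the real tool obtains its AST from the OXC parser), and the `GlobalContext` here only assumes a namespace → file → records layout.

```typescript
// Stand-in for a parsed AST node; the real tool gets its AST from the OXC parser.
interface AstNode {
  type: string;
  value?: unknown;
  line: number;
  children?: AstNode[];
}

// Hypothetical metadata record: just enough to compare files later.
interface LiteralRecord {
  file: string;
  line: number;
  fingerprint: string;
}

// Assumed layout: namespace -> file path -> records.
class GlobalContext {
  private store = new Map<string, Map<string, LiteralRecord[]>>();

  add(namespace: string, record: LiteralRecord): void {
    let byFile = this.store.get(namespace);
    if (!byFile) {
      byFile = new Map<string, LiteralRecord[]>();
      this.store.set(namespace, byFile);
    }
    const list = byFile.get(record.file) ?? [];
    list.push(record);
    byFile.set(record.file, list);
  }

  all(namespace: string): LiteralRecord[] {
    const out: LiteralRecord[] = [];
    this.store.get(namespace)?.forEach((records) => out.push(...records));
    return out;
  }
}

// Depth-first walk over the stand-in AST.
function visit(node: AstNode, onNode: (n: AstNode) => void): void {
  onNode(node);
  for (const child of node.children ?? []) visit(child, onNode);
}

// Phase 1 for a single file: parse, extract, then let the AST go.
function collectFile(file: string, parse: (file: string) => AstNode, ctx: GlobalContext): void {
  const ast = parse(file);
  visit(ast, (n) => {
    if (n.type === "Literal") {
      ctx.add("magic-number", {
        file,
        line: n.line,
        fingerprint: `${typeof n.value}:${String(n.value)}`,
      });
    }
  });
  // `ast` goes out of scope here, so only the small records stay reachable.
}

// Toy parser that returns the same tree for every file (illustration only).
const fakeParse = (_file: string): AstNode => ({
  type: "Program",
  line: 1,
  children: [
    { type: "Literal", value: 3600, line: 2 },
    { type: "Identifier", line: 3 },
    { type: "Literal", value: "utf-8", line: 4 },
  ],
});

const ctx = new GlobalContext();
for (const f of ["a.ts", "b.ts"]) collectFile(f, fakeParse, ctx);
console.log(ctx.all("magic-number").length); // → 4
```

Because each file's tree is only referenced inside `collectFile`, peak memory in this sketch is bounded by one AST plus the accumulated records rather than by the whole project.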

Phase 2: Analysis

  1. All detectors' Analyzers process collected metadata
  2. Compare fingerprints across files to find duplicates
  3. Generate Reports with similarity scores and locations
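
As a rough sketch of the analysis phase, the snippet below groups collected records by fingerprint and emits one report per group. It assumes exact fingerprint equality only, so every report gets a similarity of 1; how the real analyzers score similarity is not specified here. `Occurrence`, `Report`, and `analyze` are hypothetical names chosen for illustration.

```typescript
// Hypothetical record shape produced by the collection phase.
interface Occurrence {
  file: string;
  line: number;
  fingerprint: string;
}

// Hypothetical report shape: a similarity score plus the matching locations.
interface Report {
  fingerprint: string;
  similarity: number;
  locations: { file: string; line: number }[];
}

// Group occurrences by fingerprint and report groups that repeat.
// Assumption: only exact matches are considered, so similarity is always 1.
function analyze(occurrences: Occurrence[], minOccurrences = 2): Report[] {
  const groups = new Map<string, Occurrence[]>();
  for (const occ of occurrences) {
    const group = groups.get(occ.fingerprint) ?? [];
    group.push(occ);
    groups.set(occ.fingerprint, group);
  }

  const reports: Report[] = [];
  groups.forEach((group, fingerprint) => {
    if (group.length < minOccurrences) return;
    reports.push({
      fingerprint,
      similarity: 1,
      locations: group.map(({ file, line }) => ({ file, line })),
    });
  });
  return reports;
}

const reports = analyze([
  { file: "a.ts", line: 2, fingerprint: "number:3600" },
  { file: "b.ts", line: 9, fingerprint: "number:3600" },
  { file: "b.ts", line: 12, fingerprint: "string:utf-8" },
]);
console.log(reports.length); // → 1
```

Note that this phase never touches source files or ASTs; it works only on the compact records gathered earlier.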

Core Components

  • Runner: Orchestrates both phases synchronously
  • GlobalContext: Centralized metadata storage using nested Maps
  • Detector: Pluggable modules implementing createCollector() and analyze()
  • Collector: AST visitors that extract metadata during Phase 1
  • Reporter: Output formatters (stdout, json, html)
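
To make the relationship between these components more concrete, here is one possible shape for the plugin contract and the runner loop. Only the method names `createCollector()` and `analyze()` come from this README; their parameters, the `Store` layout, the `FlatNode` type, and the toy `shapeDetector` are assumptions made for the sake of a runnable example.

```typescript
// Assumed storage layout: namespace -> file -> records.
type Store = Map<string, Map<string, string[]>>;

// Simplified node; a real collector would visit OXC AST nodes.
interface FlatNode {
  type: string;
}

interface Collector {
  visit(node: FlatNode): void;
}

interface Report {
  detector: string;
  files: string[];
}

// Method names follow the README; the signatures are guesses.
interface Detector {
  name: string;
  createCollector(file: string, store: Store): Collector;
  analyze(store: Store): Report[];
}

// Toy detector: files with the same sequence of node types count as duplicates.
const shapeDetector: Detector = {
  name: "shape",
  createCollector(file, store) {
    const byFile = store.get("shape") ?? new Map<string, string[]>();
    store.set("shape", byFile);
    const types: string[] = [];
    byFile.set(file, types);
    return { visit: (node) => { types.push(node.type); } };
  },
  analyze(store) {
    const filesByShape = new Map<string, string[]>();
    store.get("shape")?.forEach((types, file) => {
      const shape = types.join(">");
      filesByShape.set(shape, [...(filesByShape.get(shape) ?? []), file]);
    });
    const reports: Report[] = [];
    filesByShape.forEach((files) => {
      if (files.length > 1) reports.push({ detector: "shape", files });
    });
    return reports;
  },
};

// Runner: Phase 1 over every file, then Phase 2 over the collected store.
function run(inputs: { file: string; nodes: FlatNode[] }[], detectors: Detector[]): Report[] {
  const store: Store = new Map();
  for (const { file, nodes } of inputs) {
    const collectors = detectors.map((d) => d.createCollector(file, store));
    for (const node of nodes) for (const c of collectors) c.visit(node);
  }
  const reports: Report[] = [];
  for (const d of detectors) reports.push(...d.analyze(store));
  return reports;
}

const result = run(
  [
    { file: "a.ts", nodes: [{ type: "Function" }, { type: "Return" }] },
    { file: "b.ts", nodes: [{ type: "Function" }, { type: "Return" }] },
    { file: "c.ts", nodes: [{ type: "Class" }] },
  ],
  [shapeDetector],
);
console.log(JSON.stringify(result)); // → [{"detector":"shape","files":["a.ts","b.ts"]}]
```

Keeping detectors behind a small interface like this means the runner does not need to know what any detector collects; it only drives the two phases.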

📦 Built-in Detectors

MagicNumberDetector

Finds duplicate literal values (numbers, strings, booleans, bigints).

  • Configuration: minOccurrences (default: 3)
  • Skip Logic: Ignores 0, 1, -1, small integers 2-10, and strings shorter than 3 characters
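
One way to read the skip rules and the `minOccurrences` threshold is sketched below. The function names `shouldSkip` and `findRepeatedLiterals`, and the `type:value` key format, are hypothetical; the detector's real implementation may differ, for example in how it treats booleans or negative numbers other than -1.

```typescript
// The kinds of literal values the detector is described as covering.
type Literal = number | string | boolean | bigint;

// Assumed reading of the skip rules: 0, 1, -1, integers 2-10,
// and strings shorter than 3 characters are ignored.
function shouldSkip(value: Literal): boolean {
  if (typeof value === "number") {
    if (value === 0 || value === 1 || value === -1) return true;
    return Number.isInteger(value) && value >= 2 && value <= 10;
  }
  if (typeof value === "string") return value.length < 3;
  return false; // booleans and bigints are not mentioned in the skip rules
}

// Count non-skipped literals and keep those seen at least `minOccurrences` times.
function findRepeatedLiterals(values: Literal[], minOccurrences = 3): Map<string, number> {
  const counts = new Map<string, number>();
  for (const value of values) {
    if (shouldSkip(value)) continue;
    // Key by type and value so that 42 and "42" stay distinct.
    const key = `${typeof value}:${String(value)}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  const repeated = new Map<string, number>();
  counts.forEach((n, key) => {
    if (n >= minOccurrences) repeated.set(key, n);
  });
  return repeated;
}

const repeated = findRepeatedLiterals([
  3600, 3600, 3600, // reported: appears three times
  5, 5, 5,          // skipped: small integer
  "ok", "ok", "ok", // skipped: shorter than 3 characters
  "utf-8", "utf-8", // not reported: only two occurrences
]);
console.log(Array.from(repeated.keys()).join(", ")); // → number:3600
```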

📚 Documentation

See CLAUDE.md for detailed architecture and implementation guide.

📝 License

ISC