@someshtalligeridev/pika-cli

v1.0.6


PIKA - Pattern Inspection & Knowledge Analyzer. CLI for code-similarity detection, plagiarism checking, and duplicate analysis using Rabin-Karp rolling hash with a native C engine.



🚀 Installation

npm install -g pika-cli

Requirements:

  • Node.js ≥ 18
  • C compiler (cc/gcc/clang) in PATH — auto-compiles on first run

✨ Features

| Feature | Description |
|---------|-------------|
| 🔍 Code Similarity Scan | Detect duplicate code blocks across entire projects |
| 🕵️ Plagiarism Detection | Check a file against a corpus with visual verdict |
| 🐙 GitHub Repo Scanner | Clone and analyze any public repository |
| 📁 Duplicate File Finder | Find same-named files/images across directories |
| 👁️ Live Watch Mode | Re-scan automatically on file changes |
| 📊 Side-by-Side Diff | Visual comparison of two files |
| 🐚 Interactive Shell | Persistent REPL with tab completion and history |
| ⚡ C Engine Backend | 78-line Rabin-Karp in C, auto-compiled on first run |
| 📈 Visual Analytics | Heatmaps, risk meters, complexity panels |


📦 Quick Start

# Launch interactive shell
pika

# Scan a project
pika scan ./src

# Compare two files
pika compare file1.js file2.js

# Check for plagiarism
pika scan ./homework.js
# then inside shell:
plagiarism ./homework.js ./originals/

# Scan a GitHub repo
# inside shell:
github user/repo

🛠️ Commands

| Command | Description |
|---------|-------------|
| scan <path> | Recursive duplicate code detection |
| compare <a> <b> | Side-by-side diff + similarity score |
| plagiarism <file> <corpus> | Plagiarism check with visual verdict |
| github <url> | Clone & scan a GitHub repository |
| duplicates <path> | Find duplicate file/image names |
| watch <path> | Live re-scan on file changes |
| paste | Paste code → press Enter on empty line → analyze |
| report [--format json\|txt\|html] | Export results to disk |
| status | Current session stats |
| help | Show all commands |


🧠 Algorithm

PIKA uses the Rabin-Karp rolling hash algorithm implemented in C (78 lines):

hash(chunk) = (Σ char[i] · 257^(n-i-1)) mod (10^9 + 7)
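
The formula above can be sketched in JavaScript to show how the polynomial hash is evaluated and how a rolling update avoids rehashing each window from scratch. This is a minimal illustration, not the C engine shipped in `pika_rk.c`: the function names `polyHash` and `rollingHashes` are hypothetical, and the sketch hashes UTF-16 code units over a character window, whereas PIKA hashes chunks of lines.

```javascript
// Constants taken from the formula above.
const BASE = 257;
const MOD = 1_000_000_007;

// Direct evaluation of the polynomial hash using Horner's rule.
// (Hypothetical helper, not part of the pika-cli API.)
function polyHash(text) {
  let hash = 0;
  for (let i = 0; i < text.length; i++) {
    // hash < MOD, so hash * BASE + code stays far below 2^53 (no precision loss).
    hash = (hash * BASE + text.charCodeAt(i)) % MOD;
  }
  return hash;
}

// Hashes of every window of `size` characters, each derived from the
// previous one in O(1) instead of rehashing the whole window.
function rollingHashes(text, size) {
  if (size <= 0 || text.length < size) return [];

  // highPow = BASE^(size-1) mod MOD: the weight of the outgoing character.
  let highPow = 1;
  for (let i = 1; i < size; i++) highPow = (highPow * BASE) % MOD;

  const hashes = [polyHash(text.slice(0, size))];
  for (let i = size; i < text.length; i++) {
    const outgoing = (text.charCodeAt(i - size) * highPow) % MOD;
    // Remove the outgoing character, shift by BASE, append the incoming one.
    let hash = (hashes[hashes.length - 1] - outgoing + MOD) % MOD;
    hash = (hash * BASE + text.charCodeAt(i)) % MOD;
    hashes.push(hash);
  }
  return hashes;
}
```

Because two different inputs can share a hash, a match on the hash alone is only a candidate; the pipeline below verifies candidates with a string comparison.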

Pipeline

① File Collection    O(d)    Walk directory tree
② Normalization      O(n)    Strip comments (language-aware)
③ Chunking           O(n)    5-line sliding windows
④ Hashing (C)        O(n)    Polynomial rolling hash
⑤ Collision Detect   O(k²)   Hash table lookup + strcmp verify
⑥ Pair Scoring       O(p)    similarity = shared/min(chunks) × 100
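
Steps ③ and ⑥ can be sketched in a few lines of JavaScript. This is an illustration under stated assumptions rather than PIKA's implementation: `chunkLines` and `similarity` are hypothetical names, normalization is reduced to trimming lines and dropping blank ones, only distinct chunks are counted, and a JavaScript `Set` of chunk strings stands in for the C hash table plus `strcmp` verification.

```javascript
// Step ③: split source into overlapping windows of `chunkSize` lines.
// Assumption: normalization here only trims lines and drops blank ones.
function chunkLines(source, chunkSize = 5) {
  const lines = source
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const chunks = [];
  for (let i = 0; i + chunkSize <= lines.length; i++) {
    chunks.push(lines.slice(i, i + chunkSize).join('\n'));
  }
  return chunks;
}

// Step ⑥: similarity = shared / min(chunks) × 100.
// A Set of chunk strings replaces hash lookup + strcmp in this sketch.
function similarity(sourceA, sourceB, chunkSize = 5) {
  const chunksA = new Set(chunkLines(sourceA, chunkSize));
  const chunksB = new Set(chunkLines(sourceB, chunkSize));

  const smaller = Math.min(chunksA.size, chunksB.size);
  if (smaller === 0) return 0; // a file shorter than one chunk shares nothing

  let shared = 0;
  for (const chunk of chunksA) {
    if (chunksB.has(chunk)) shared++;
  }
  return (shared / smaller) * 100;
}
```

Dividing by the smaller chunk count means a short file copied wholesale into a much larger one still scores 100.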

Complexity

| | Average | Worst |
|---|---|---|
| Time | O(n × m) | O(n × m) |
| Space | O(n) | O(n) |

n = total lines across all files, m = number of files


📊 Visual Output

PIKA provides rich terminal visuals:

  • ⚠️ Risk Meter — LOW / MEDIUM / HIGH / CRITICAL gauge
  • ▓ Heatmap — Color-coded duplicate density per file
  • ① Pipeline — Step-by-step algorithm complexity breakdown
  • █ Similarity Bars — Visual percentage indicators
  • ✓/✗ Verdict — ORIGINAL / SUSPICIOUS / PLAGIARIZED

🔌 API

Use PIKA programmatically:

import { analyze, compareTwo } from 'pika-cli/similarity';
import { collectFiles, readFilesContent } from 'pika-cli/scanner';

// Scan a directory
const paths = await collectFiles('./src');
const files = await readFilesContent(paths);
const result = analyze(files, { chunkSize: 5, minSimilarity: 10 });

console.log(result.summary);
// { totalFiles: 25, totalChunks: 2884, duplicateChunks: 36, pairCount: 20, ... }

// Compare two files
const pair = compareTwo('a.js', sourceA, 'b.js', sourceB);
console.log(pair.similarity); // 42.5

⚙️ Configuration

Create .pikaignore in your project root (same syntax as .gitignore):

dist/
*.min.js
vendor/
node_modules/

CLI Flags

pika scan ./src --threshold 20    # Only show pairs ≥ 20% similar
pika scan ./src --chunk 3         # Use 3-line windows (more granular)
pika scan ./src --top 5           # Show top 5 pairs only

🔒 Security

| Protection | Implementation |
|---|---|
| Command Injection | spawnSync with array args (no shell) |
| Path Traversal | Root boundary check + symlink rejection |
| URL Validation | Regex-validated before git clone |
| Resource Limits | 120s timeout, 64MB buffer cap |
| Dependency Safety | No eval, no prototype pollution vectors |
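
The sketch below shows roughly what the first four protections could look like in Node.js. It is not PIKA's source: the regex, the helper names (`isValidRepoUrl`, `isInsideRoot`, `cloneRepo`) and the `git clone` arguments are assumptions made for illustration, and symlink rejection is omitted. Only `spawnSync`, its `timeout` and `maxBuffer` options, and the `path` functions are real Node.js APIs. CommonJS `require` is used so the snippet runs as a plain script.

```javascript
const { spawnSync } = require('node:child_process');
const path = require('node:path');

// Hypothetical pattern: accept only https://github.com/<owner>/<repo>[.git].
const GITHUB_URL = /^https:\/\/github\.com\/[A-Za-z0-9-]+\/[A-Za-z0-9_.-]+$/;

function isValidRepoUrl(url) {
  return typeof url === 'string' && GITHUB_URL.test(url);
}

// Root boundary check: true when `target` resolves to `root` or a path under it.
function isInsideRoot(root, target) {
  const rel = path.relative(path.resolve(root), path.resolve(root, target));
  if (rel === '') return true;
  return rel !== '..' && !rel.startsWith('..' + path.sep) && !path.isAbsolute(rel);
}

// Clone with array arguments and no `shell` option, so the URL is passed
// to git as a single argument and is never interpreted by a shell.
function cloneRepo(url, dest) {
  if (!isValidRepoUrl(url)) {
    throw new Error(`Rejected repository URL: ${url}`);
  }
  return spawnSync('git', ['clone', '--depth', '1', '--', url, dest], {
    timeout: 120_000,            // 120s timeout, as listed in the table
    maxBuffer: 64 * 1024 * 1024, // 64MB buffer cap, as listed in the table
    encoding: 'utf8',
  });
}
```

Validating before spawning keeps malformed input from ever reaching `git`, and the `--` separator stops a crafted value from being read as a command-line option.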


🏗️ Architecture

pika-cli/
├── src/
│   ├── algorithms/
│   │   ├── pika_rk.c          # C Rabin-Karp engine (78 lines)
│   │   └── rabinKarp.js       # JS wrapper (compile + spawn)
│   ├── commands/
│   │   ├── scan.js            # Directory scanning
│   │   ├── compare.js         # File comparison
│   │   ├── plagiarism.js      # Plagiarism detection
│   │   ├── github.js          # GitHub repo scanner
│   │   ├── duplicates.js      # Duplicate filename finder
│   │   ├── watch.js           # Live file watcher
│   │   ├── paste.js           # Inline paste analysis
│   │   └── report.js          # Export results
│   ├── core/
│   │   ├── similarity.js      # Scoring engine
│   │   ├── scanner.js         # File system walker
│   │   └── session.js         # Session state
│   ├── ui/
│   │   ├── header.js          # ASCII art + image
│   │   ├── renderer.js        # Visual components
│   │   ├── statusBar.js       # Live status
│   │   ├── banner.js          # Welcome screen
│   │   └── diffView.js        # Side-by-side diff
│   ├── utils/
│   │   ├── langDetect.js      # Language-aware normalization
│   │   ├── fileFilter.js      # Ignore rules
│   │   ├── logger.js          # Colored logging
│   │   └── timer.js           # Performance timing
│   ├── shell/
│   │   └── interactiveShell.js # REPL with history
│   └── index.js               # Entry point
├── assets/
│   └── pika.png               # Mascot image
├── tests/
│   ├── run-tests.mjs          # 10 test cases
│   └── fixtures/              # Test data
├── .github/workflows/ci.yml   # CI/CD pipeline
└── package.json

🧪 Testing

# Run all 10 test cases
npm test

# Test cases cover:
# TC-01: Normal — shared code blocks
# TC-02: Normal — unrelated files
# TC-03: Edge — empty corpus
# TC-04: Edge — single file
# TC-05: Edge — identical files
# TC-06: Edge — internal duplicates
# TC-07: Extreme — 1000 files × 50 lines
# TC-08: Extreme — 100 identical files
# TC-09: Extreme — 100 unique files
# TC-10: Extreme — Unicode + comment stripping

🤝 Contributing

git clone https://github.com/SomeshTalligeriDEV/pika-cli--daa.git
cd pika-cli--daa
npm install
npm test
node src/index.js

PRs welcome! Please ensure all 10 tests pass before submitting.


📄 License

MIT © Somesh S Talligeri