@bvvvp009/semcode

v0.1.1

Semantic code search CLI - find code by meaning, not just text patterns

semcode - Semantic Code Search CLI

A semantic, grep-like search tool for code that understands natural language queries. Use grep for exact matches, semcode for semantic understanding.


About

semcode is a local-first semantic code search tool that helps you find code by meaning, not just text patterns. It intelligently routes queries:

  • Simple/exact queries → Use grep (fast, instant results)
  • Complex/semantic queries → Use semcode (token savings, better relevance)

Key Features:

  • 🔍 Semantic search with natural language queries
  • 💰 About 82% fewer tokens than grep on complex queries (see Benchmarks)
  • ⚡ Fast indexing with local embeddings
  • 🎯 Intelligent tool selection (grep vs semcode)
  • 🔒 Works entirely offline, no cloud required

Quick Start

1. Install

# Option 1: Install globally (when published)
npm install -g @bvvvp009/semcode

# Option 2: Build from source
cd /path/to/semcode
npm install
npm run build

2. Initialize in Your Project

cd /path/to/your-project

# One command: indexes files + sets up Cursor rules
semcode init

That's it! The init command:

  • ✅ Indexes your workspace files
  • ✅ Creates .cursor/rules/semcode-search.mdc with intelligent routing rules
  • ✅ Configures Cursor to automatically route queries to grep (simple) or semcode (complex)

3. Restart Cursor

Close and reopen Cursor to load the new rules. Cursor agents will now automatically validate queries and route them to the correct tool:

  • Simple/exact queries → grep (fast, low tokens)
  • Complex/semantic queries → semcode (82% token savings, reduces cache reads)

This validation reduces cache reads in large projects. In the benchmarks in this README, semantic queries used about 82% fewer tokens with semcode than with grep, which cuts costs accordingly.


Commands

semcode init

Initialize workspace: index files and setup Cursor rules.

semcode init              # Index + setup rules
semcode init --clear      # Clear existing index first
semcode init --skip-index # Only setup rules, skip indexing

semcode index or semcode index-local

Re-index your workspace (use when files change significantly).

semcode index                    # Re-index files (alias for index-local)
semcode index-local              # Re-index files (local, no cloud)
semcode index --clear            # Clear existing index first
semcode index-local --clear      # Clear existing index first
semcode index --omit dist build  # Exclude additional folders (node_modules excluded by default)

Default exclusions:

  • node_modules/ - Dependencies
  • .git/ - Git metadata
  • dist/, build/ - Build outputs
  • Files matching: *.lock, *.bin, *.ipynb, *.pyc, *.pyo

Note: All subfolders are indexed recursively. Use --omit to exclude additional paths.
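
The exclusion behavior described above can be pictured as a small path filter. This is a minimal sketch only: the folder and extension lists are copied from the default exclusions listed here, while the function name `isExcluded`, the matching logic, and the use of POSIX-style relative paths are assumptions and may not match how semcode implements it.

```typescript
// Defaults copied from the list above; the matching logic itself is an assumption.
const DEFAULT_DIRS = ["node_modules", ".git", "dist", "build"];
const DEFAULT_EXTENSIONS = [".lock", ".bin", ".ipynb", ".pyc", ".pyo"];

// Hypothetical helper: decides whether a POSIX-style relative path is skipped.
// `omit` stands in for the extra folder names passed via --omit.
function isExcluded(relativePath: string, omit: string[] = []): boolean {
  const segments = relativePath.split("/");
  const excludedDirs = new Set([...DEFAULT_DIRS, ...omit]);
  // Excluded if any parent folder in the path is an excluded folder...
  if (segments.slice(0, -1).some((dir) => excludedDirs.has(dir))) return true;
  // ...or if the file name ends with an excluded extension.
  const fileName = segments[segments.length - 1] ?? "";
  return DEFAULT_EXTENSIONS.some((ext) => fileName.endsWith(ext));
}
```

With this sketch, `isExcluded("src/generated/api.ts", ["generated"])` mirrors running with `--omit generated`: the file is skipped because one of its parent folders is in the omit list.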

semcode watch-local

Automatically watch for file changes and update the index in real time.

semcode watch-local              # Watch and auto-index file changes
semcode watch-local --omit dist  # Watch with additional exclusions

Features:

  • ✅ Automatically indexes files when they change
  • ✅ Watches all subfolders recursively (except excluded paths)
  • ✅ Real-time updates - no manual re-indexing needed
  • ✅ Excludes node_modules/ by default
  • ✅ Press Ctrl+C to stop watching

How It Works

After running semcode init, Cursor agents validate queries and automatically route to the correct tool:

Validation Process (Automatic)

Before each search, agents:

  1. Count words in query
  2. Check for question words (how, what, where, why, when, which)
  3. Identify query type (exact vs semantic)
  4. Route to appropriate tool

Simple Queries → grep

  • Exact matches: authenticateUser, const API_KEY
  • Short queries (< 10 words, no questions)
  • Regex patterns
  • Debugging exact strings

Example:

User: "find authenticateUser"
Agent: grep -r "authenticateUser" src/

Complex Queries → semcode

  • Natural language: "how is authentication implemented?"
  • Long queries (≥ 10 words)
  • Question words: how, what, where, why
  • Architecture/pattern exploration

Example:

User: "how is user authentication and authorization implemented?"
Agent: Uses semcode search internally (configured via rules)
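
The validation steps and thresholds above can be summarized as a small heuristic. This is a sketch under stated assumptions: the actual routing is performed by Cursor agents following the rules in .cursor/rules/semcode-search.mdc, and the function name `routeQuery` is hypothetical.

```typescript
// Hypothetical sketch of the routing heuristic; not semcode's own code.
type Tool = "grep" | "semcode";

const QUESTION_WORDS = new Set(["how", "what", "where", "why", "when", "which"]);

function routeQuery(query: string): Tool {
  // Step 1: count words in the query.
  const words = query
    .trim()
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => word.length > 0);
  // Step 2: check for question words, ignoring punctuation such as a trailing "?".
  const hasQuestionWord = words.some((word) =>
    QUESTION_WORDS.has(word.replace(/[^a-z]/g, ""))
  );
  // Steps 3-4: long or question-style queries are semantic; the rest are exact.
  return words.length >= 10 || hasQuestionWord ? "semcode" : "grep";
}

console.log(routeQuery("find authenticateUser")); // grep
console.log(routeQuery("how is user authentication and authorization implemented?")); // semcode
```

The sketch treats a query as semantic if either condition holds, matching the "≥ 10 words or contains a question word" rule; regex detection is left out for brevity.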

Cache Reduction & Cost Savings

Without validation (wrong tool selection):

  • Semantic queries with grep → 3,500+ tokens (multiple file reads, cache exhaustion)
  • Cost: $1,054.75/month (500 sessions, 50 queries each)

With validation (correct tool selection):

  • Semantic queries with semcode → 750 tokens (targeted results, minimal cache reads)
  • Cost: $189.00/month (500 sessions, 50 queries each)
  • Savings: $865.75/month (82% reduction)

The rules direct agents to use semcode for semantic queries instead of reading entire files, which substantially reduces cache reads in large projects.


Benchmarks

Token Savings: 82% Reduction

We tested 8 difficult semantic queries comparing grep vs semcode:

| Query Type | grep Tokens | semcode Tokens | Savings |
|------------|-------------|----------------|---------|
| Error Handling | ~4,500 | ~746 | -83.4% |
| Authentication | ~5,500 | ~760 | -86.2% |
| API Routes | ~8,750 | ~760 | -91.3% |
| State Management | ~3,000 | ~760 | -74.7% |
| File Processing | ~5,000 | ~760 | -84.8% |
| Performance | ~2,000 | ~760 | -62.0% |
| Configuration | ~2,750 | ~755 | -72.5% |
| Security | ~2,250 | ~750 | -66.7% |
| TOTAL | ~33,750 | ~6,051 | -82.1% |

Key Findings

Token Savings:

  • grep average: ~4,219 tokens per complex query
  • semcode average: ~756 tokens per complex query
  • Savings per query: ~3,463 tokens (82%)
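
The averages above can be re-derived from the benchmark table. The snippet below is only a worked check of that arithmetic; the token counts are the approximate values from the table, and the variable names are illustrative.

```typescript
// [query type, grep tokens, semcode tokens], copied from the benchmark table.
const results: Array<[string, number, number]> = [
  ["Error Handling", 4500, 746],
  ["Authentication", 5500, 760],
  ["API Routes", 8750, 760],
  ["State Management", 3000, 760],
  ["File Processing", 5000, 760],
  ["Performance", 2000, 760],
  ["Configuration", 2750, 755],
  ["Security", 2250, 750],
];

const sum = (values: number[]): number => values.reduce((a, b) => a + b, 0);

const grepTotal = sum(results.map(([, grep]) => grep)); // 33750
const semcodeTotal = sum(results.map(([, , semcode]) => semcode)); // 6051

const grepAverage = grepTotal / results.length; // 4218.75
const semcodeAverage = semcodeTotal / results.length; // 756.375
const savingsPercent = (1 - semcodeTotal / grepTotal) * 100; // about 82.07
```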

Performance:

  • grep: ~57ms average (but needs multiple queries + filtering)
  • semcode: ~940ms average (single semantic query)
  • Trade-off: semcode is ~16x slower but returns 82% fewer, more relevant tokens

Relevance:

  • grep: ~17% relevant results (many false positives, requires reading multiple files)
  • semcode: ~80% relevant results (semantic understanding, targeted results)
  • By this measure, semcode results are relevant roughly 4 to 5 times as often as grep results (~80% vs ~17%)

Cache Reduction:

  • grep (semantic): Reads 20+ files to find patterns → High cache usage
  • semcode (semantic): Returns top 10 relevant results → Minimal cache usage
  • 82% reduction in file reads and cache consumption in large projects

Real-World Cost Analysis

Scenario: AI Agent Session (50 semantic queries)

  • grep: ~211,000 tokens = $2.11 (at $0.01/1K tokens)
  • semcode: ~38,000 tokens = $0.38
  • Savings: $1.73 per session (82% reduction)

Monthly Usage (500 sessions):

  • grep: $1,054.75
  • semcode: $189.00
  • Savings: $865.75/month
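
The monthly figures follow from the average tokens per query, the $0.01 per 1K tokens rate, 50 queries per session, and 500 sessions per month. A short worked calculation, with illustrative constant and function names, makes the steps explicit:

```typescript
// Inputs taken from the scenario above; names are illustrative.
const PRICE_PER_1K_TOKENS = 0.01; // USD
const QUERIES_PER_SESSION = 50;
const SESSIONS_PER_MONTH = 500;

function monthlyCost(avgTokensPerQuery: number): number {
  const tokensPerSession = avgTokensPerQuery * QUERIES_PER_SESSION;
  const costPerSession = (tokensPerSession / 1000) * PRICE_PER_1K_TOKENS;
  return costPerSession * SESSIONS_PER_MONTH;
}

const grepMonthly = monthlyCost(4219); // grep average tokens per complex query
const semcodeMonthly = monthlyCost(756); // semcode average tokens per complex query

console.log(grepMonthly.toFixed(2)); // 1054.75
console.log(semcodeMonthly.toFixed(2)); // 189.00
console.log((grepMonthly - semcodeMonthly).toFixed(2)); // 865.75
```

Actual costs depend on the model's real per-token price and on how many queries are semantic, so these numbers are best read as an estimate for the stated scenario.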

When to Use Each Tool

Use grep when:

  • ✅ You know the exact pattern/symbol
  • ✅ You need speed (instant results)
  • ✅ You want ALL matches (comprehensive search)
  • ✅ Searching for exact strings, regex patterns
  • ✅ Debugging specific issues (exact error messages)

Example:

# Perfect for grep - exact symbol
grep -r "authenticateUser" src/

# Perfect for grep - regex pattern (-E enables extended regex so {3} is a repetition count)
grep -rE "error.*code.*[0-9]{3}" src/

Use semcode when:

  • ✅ You're exploring an unfamiliar codebase
  • ✅ You need semantic understanding
  • ✅ You want TOP relevant results (not all matches)
  • ✅ Token usage matters (AI agents, API costs)
  • ✅ You don't know exact naming conventions

Example:

# Perfect for semcode - semantic understanding
# Cursor agents automatically use semcode for queries like:
# "how is user authentication implemented"
# "how are API endpoints structured"

File Structure

After running semcode init:

your-project/
├── .cursor/
│   └── rules/
│       └── semcode-search.mdc # Cursor rules (auto-created)
├── .semcode/
│   ├── local-index.json       # Search index
│   └── .lock                  # Lock file
└── ...

Keeping Index Updated

Re-index your workspace when files change significantly:

semcode index           # Re-index files
semcode index --clear   # Clear and re-index

Troubleshooting

semcode command not found

Solution:

  • Use the full path with Node: node /path/to/semcode/dist/index.js init
  • Or install globally: npm install -g @bvvvp009/semcode

Index not found

Solution: Run semcode index, semcode index-local, or semcode init

Cursor not using semcode

Solution:

  1. Check .cursor/rules/semcode-search.mdc exists
  2. Restart Cursor
  3. Verify query complexity (should be ≥ 10 words or contain a question word such as how, what, or where)

Index out of date

Solution: Run semcode index or semcode index-local to re-index


How It Works Under the Hood

  1. Indexing: Files are chunked and embedded using local Transformers.js models
  2. Search: Queries are embedded and matched against indexed chunks using cosine similarity
  3. Routing: Cursor rules analyze query complexity and route to appropriate tool
  4. Storage: Everything is stored locally in .semcode/local-index.json

No cloud required - all processing happens locally on your machine.
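
The search step can be illustrated with a small cosine-similarity ranking. This is a sketch under assumptions, not semcode's implementation: the `Chunk` field names and the `search` function are hypothetical, the toy 3-dimensional vectors stand in for real embeddings, and the default of 10 results follows the "top 10 relevant results" mentioned in the benchmarks.

```typescript
// Hypothetical shape of an indexed chunk; the real index schema may differ.
interface Chunk {
  file: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|), or 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions must match");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Score every chunk against the query embedding and keep the best topK.
function search(queryEmbedding: number[], chunks: Chunk[], topK = 10) {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy data: real embeddings have hundreds of dimensions.
const chunks: Chunk[] = [
  { file: "src/auth.ts", text: "function login() {}", embedding: [1, 0, 0] },
  { file: "src/routes.ts", text: "router.get('/users')", embedding: [0, 1, 0] },
];
const best = search([0.9, 0.1, 0], chunks, 1);
console.log(best[0]?.chunk.file); // src/auth.ts
```

A linear scan like this is simple and works for a local JSON index; whether semcode uses a scan or a more specialized structure is not stated in this README.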


License

Apache 2.0


Contributing

Contributions welcome! Please open an issue or PR.


Built with ❤️ for developers who want smarter code search