@masuidrive/bloom-local-rag

v0.3.3

RAG (Retrieval-Augmented Generation) system for local directories - Index and search your documents with AI-powered answers

🇯🇵 Japanese version

RAG (Retrieval-Augmented Generation) system for local directories. This tool enables semantic search across your local documents with AI-powered answers, without requiring a daemon process.

What is bloom-local-rag?

bloom-local-rag is a command-line tool that brings the power of RAG to your local files. It creates a vector database from your documents and uses Large Language Models (LLMs) to provide accurate, context-aware answers based on your actual content.

Key Features

  • 🔍 Semantic Search: Find information based on meaning, not just keywords
  • 🤖 AI-Powered Answers: Get contextual answers generated from your documents
  • 📁 Multiple File Types: Supports Markdown, code files (JS/TS), YAML, and more
  • 🔄 Smart Indexing: Automatically updates the index at query time when files have changed
  • 🚀 No Daemon Required: Runs on-demand without background processes
  • 💾 Efficient Storage: Uses LanceDB for fast vector operations
  • 🌐 Multi-Provider Support: Works with both Google Gemini and OpenAI

Installation

No installation required! Use directly with npx:

npx @masuidrive/bloom-local-rag

Or install globally:

npm install -g @masuidrive/bloom-local-rag

Quick Start

1. Set up your API key

For Google Gemini (recommended):

export GOOGLE_API_KEY=your-api-key
# or
export GEMINI_API_KEY=your-api-key

For OpenAI:

export OPENAI_API_KEY=your-api-key

2. Initialize your directory

npx @masuidrive/bloom-local-rag --init

This creates a .bloom-local-rag directory with the vector database.

3. Search your documents

npx @masuidrive/bloom-local-rag "how do I authenticate users?"

# Search a specific directory
npx @masuidrive/bloom-local-rag "how do I authenticate users?" --dir /path/to/docs

# Or use --directory
npx @masuidrive/bloom-local-rag --directory /path/to/docs "how do I authenticate users?"

Detailed Usage

Initialize Command

The --init option scans your directory and creates a searchable index:

npx @masuidrive/bloom-local-rag --init [options]

Options:

  • -d, --dir, --directory <path>: Directory to initialize (default: current directory)
  • -e, --extensions <exts...>: File extensions to index (default: .md, .mdx, .txt, .js, .ts, .jsx, .tsx, .yaml, .yml)
  • --chunk-size <size>: Text chunk size for indexing (default: 1000)
  • --chunk-overlap <size>: Overlap between chunks (default: 200)
  • --embedding-provider <provider>: Choose between 'gemini' or 'openai' (default: gemini)
  • --embedding-model <model>: Specific embedding model to use
  • --llm-provider <provider>: LLM provider for answers (default: gemini)
  • --llm-model <model>: Specific LLM model to use
  • --exclude <patterns...>: Additional patterns to exclude from indexing

Examples:

# Initialize current directory
npx @masuidrive/bloom-local-rag --init

# Initialize a specific directory
npx @masuidrive/bloom-local-rag --init --dir ./docs

# Index only markdown and TypeScript files
npx @masuidrive/bloom-local-rag --init --extensions .md .ts

# Use OpenAI for both embeddings and answers
npx @masuidrive/bloom-local-rag --init --embedding-provider openai --llm-provider openai
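
To see how --chunk-size and --chunk-overlap interact, here is a minimal TypeScript sketch of a sliding-window chunker. It assumes sizes are measured in characters, which this README does not state, and chunkText is a hypothetical helper written for illustration, not the package's actual splitter.

```typescript
// Illustrative only: a fixed-size character chunker (assumed behavior,
// not taken from the bloom-local-rag source).
function chunkText(text: string, chunkSize = 1000, chunkOverlap = 200): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in [0, chunkSize)");
  }
  // Each new window starts this many characters after the previous one,
  // so consecutive chunks share `chunkOverlap` characters.
  const step = chunkSize - chunkOverlap;
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) {
      break; // this window already reached the end of the text
    }
  }
  return chunks;
}

// A 2500-character document with the default settings (1000 / 200).
const chunkLengths = chunkText("x".repeat(2500)).map((chunk) => chunk.length);
console.log(chunkLengths.join(", ")); // 1000, 1000, 900
```

With the defaults, each chunk repeats the last 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.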

Query Command (Default Mode)

Search your indexed documents and get AI-powered answers:

npx @masuidrive/bloom-local-rag "your question here" [options]

Options:

  • -d, --dir, --directory <path>: Directory to search (default: current directory)
  • -l, --limit <n>: Number of source documents to retrieve (default: 5)
  • --no-context: Skip AI answer generation, show only source documents
  • --json: Output results in JSON format
  • --temperature <value>: Control creativity of AI answers (0-2, default: 0.7)
  • -v, --verbose: Show detailed information including sources

Examples:

# Simple query
npx @masuidrive/bloom-local-rag "how to handle errors in async functions"

# Get more source documents
npx @masuidrive/bloom-local-rag "database schema design" --limit 10

# Get only relevant documents without AI summary
npx @masuidrive/bloom-local-rag "API endpoints" --no-context

# Get JSON output for integration with other tools
npx @masuidrive/bloom-local-rag "user authentication" --json

# Search in a different directory
npx @masuidrive/bloom-local-rag "deployment process" --dir ../other-project

# Directory option can come before or after the query
npx @masuidrive/bloom-local-rag --dir ../docs "deployment process"
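
For integration with other tools, the --json flag can be driven from a script. The following TypeScript sketch builds the argument list from the flags documented above and runs the CLI through Node's child_process module. The helper names (buildQueryArgs, runQuery) are hypothetical, the JSON schema is not documented here so the result is typed as unknown, and the sketch assumes that stdout contains only JSON when --json is passed.

```typescript
import { execFileSync } from "node:child_process";

interface QueryOptions {
  dir?: string;   // maps to --dir
  limit?: number; // maps to --limit
}

// Builds the npx argument list using only flags documented in this README.
// "--yes" is an npx option that skips the install confirmation prompt.
function buildQueryArgs(question: string, options: QueryOptions = {}): string[] {
  const args = ["--yes", "@masuidrive/bloom-local-rag", question, "--json"];
  if (options.dir !== undefined) {
    args.push("--dir", options.dir);
  }
  if (options.limit !== undefined) {
    args.push("--limit", String(options.limit));
  }
  return args;
}

// Runs the query and parses stdout. The output shape is an unknown here:
// inspect the parsed value before relying on any particular field.
function runQuery(question: string, options: QueryOptions = {}): unknown {
  const stdout = execFileSync("npx", buildQueryArgs(question, options), {
    encoding: "utf8",
  });
  return JSON.parse(stdout);
}
```

Calling runQuery("user authentication", { dir: "../docs", limit: 10 }) requires an initialized directory and a configured API key. On Windows the executable may need to be "npx.cmd" instead of "npx".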

Reindex Command

Manually update the index (though this happens automatically during queries):

npx @masuidrive/bloom-local-rag --reindex [options]

Options:

  • -d, --dir, --directory <path>: Directory to reindex (default: current directory)
  • --force: Force reindex all files, ignoring cache
  • -v, --verbose: Show detailed information

Status Command

Check the status of your indexed documents:

npx @masuidrive/bloom-local-rag --status [options]

Options:

  • -d, --dir, --directory <path>: Directory to check status (default: current directory)

Shows:

  • Configuration details
  • Number of indexed files and chunks
  • Last index update time
  • Storage usage

How It Works

  1. Indexing: bloom-local-rag scans your directory and splits documents into chunks
  2. Embedding: Each chunk is converted to a vector embedding using AI models
  3. Storage: Vectors are stored in a local LanceDB database
  4. Search: Your query is converted to a vector and compared with stored vectors
  5. Context: Most relevant chunks are retrieved as context
  6. Answer: An LLM generates an answer based on the retrieved context
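
Steps 4 and 5 can be sketched in a few lines of TypeScript. This is a toy in-memory version under stated assumptions: it ranks by cosine similarity, which is a common choice but is not confirmed as the metric this tool uses, and it scans a plain array rather than querying LanceDB. StoredChunk, cosineSimilarity, and topK are illustrative names.

```typescript
interface StoredChunk {
  text: string;
  vector: number[];
}

// Cosine similarity: 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // a zero vector has no direction; treat it as unrelated
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the `limit` chunks most similar to the query vector,
// mirroring the role of the --limit option (default: 5).
function topK(query: number[], chunks: StoredChunk[], limit = 5): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((entry) => entry.chunk);
}

// Toy 2-dimensional vectors; real embeddings have hundreds of dimensions.
const ranked = topK(
  [1, 0],
  [
    { text: "A", vector: [1, 0] },
    { text: "B", vector: [0, 1] },
    { text: "C", vector: [0.7, 0.7] },
  ],
  2,
);
console.log(ranked.map((chunk) => chunk.text).join(", ")); // A, C
```

The retrieved chunks are then passed to the LLM as context for step 6.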

File Type Support

By default, bloom-local-rag indexes:

  • Documentation: .md, .mdx, .txt
  • Code: .js, .ts, .jsx, .tsx
  • Configuration: .yaml, .yml

Special handling:

  • Markdown files: Frontmatter is extracted as metadata
  • YAML files: Parsed for structured data
  • .gitignore: Respected for file exclusion
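
As an illustration of the Markdown handling, the sketch below separates a leading frontmatter block from the document body. It is a simplified stand-in, not the parser the package uses: splitFrontmatter is a hypothetical helper, it returns the frontmatter as raw text without parsing the YAML, and it does not handle an empty frontmatter block.

```typescript
interface SplitMarkdown {
  frontmatter: string | null; // raw YAML text between the --- lines, unparsed
  body: string;               // everything after the closing --- line
}

// Matches a block that starts on the first line with "---" and ends at the
// next line consisting of "---".
const FRONTMATTER_PATTERN = /^---\r?\n([\s\S]*?)\r?\n---(?:\r?\n|$)/;

function splitFrontmatter(source: string): SplitMarkdown {
  const match = FRONTMATTER_PATTERN.exec(source);
  if (match === null) {
    return { frontmatter: null, body: source };
  }
  return { frontmatter: match[1], body: source.slice(match[0].length) };
}

const parts = splitFrontmatter("---\ntitle: Auth guide\n---\n# Login\n");
console.log(parts.frontmatter); // title: Auth guide
console.log(JSON.stringify(parts.body)); // "# Login\n"
```

Keeping metadata apart from the body means fields such as a title can be stored alongside each chunk instead of being embedded as ordinary text.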

Best Practices

  1. Choose the Right Files: Focus on documentation, well-commented code, and configuration files
  2. Chunk Size: Use larger chunks (for example, --chunk-size 2000) for narrative docs and smaller ones (for example, --chunk-size 500) for code
  3. Exclusions: Exclude generated files, build outputs, and dependencies
  4. API Keys: Use environment variables, never commit keys to version control
  5. Regular Updates: The index updates automatically whenever you run a query; use --reindex to refresh it manually

Configuration

The .bloom-local-rag/config.json file stores your settings:

{
  "version": "1.0",
  "directory": "/path/to/your/project",
  "extensions": [".md", ".js", ".ts"],
  "embedding": {
    "provider": "gemini",
    "model": "text-embedding-004",
    "chunkSize": 1000,
    "chunkOverlap": 200
  },
  "llm": {
    "provider": "gemini",
    "model": "gemini-2.0-flash-exp",
    "temperature": 0.7
  }
}
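
If you read this file from your own scripts, a typed loader helps catch malformed settings early. The TypeScript sketch below mirrors only the fields shown in the sample above. It is not an official schema, the real file may contain other fields, and BloomConfig and parseConfig are hypothetical names.

```typescript
// Shape inferred from the sample config above; not an official schema.
type Provider = "gemini" | "openai";

interface BloomConfig {
  version: string;
  directory: string;
  extensions: string[];
  embedding: { provider: Provider; model: string; chunkSize: number; chunkOverlap: number };
  llm: { provider: Provider; model: string; temperature: number };
}

type JsonObject = Record<string, unknown>;

function asObject(value: unknown, name: string): JsonObject {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    throw new Error(`${name} must be an object`);
  }
  return value as JsonObject;
}

function readString(obj: JsonObject, key: string): string {
  const value = obj[key];
  if (typeof value !== "string") throw new Error(`${key} must be a string`);
  return value;
}

function readNumber(obj: JsonObject, key: string): number {
  const value = obj[key];
  if (typeof value !== "number") throw new Error(`${key} must be a number`);
  return value;
}

function readProvider(obj: JsonObject): Provider {
  const value = obj["provider"];
  if (value === "gemini" || value === "openai") return value;
  throw new Error('provider must be "gemini" or "openai"');
}

function readStringArray(obj: JsonObject, key: string): string[] {
  const value = obj[key];
  if (!Array.isArray(value)) throw new Error(`${key} must be an array`);
  return value.map((item) => {
    if (typeof item !== "string") throw new Error(`${key} must contain only strings`);
    return item;
  });
}

// Parses the JSON text of config.json and checks every field in the sample.
function parseConfig(json: string): BloomConfig {
  const root = asObject(JSON.parse(json), "config");
  const embedding = asObject(root["embedding"], "embedding");
  const llm = asObject(root["llm"], "llm");
  return {
    version: readString(root, "version"),
    directory: readString(root, "directory"),
    extensions: readStringArray(root, "extensions"),
    embedding: {
      provider: readProvider(embedding),
      model: readString(embedding, "model"),
      chunkSize: readNumber(embedding, "chunkSize"),
      chunkOverlap: readNumber(embedding, "chunkOverlap"),
    },
    llm: {
      provider: readProvider(llm),
      model: readString(llm, "model"),
      temperature: readNumber(llm, "temperature"),
    },
  };
}
```

Pass parseConfig the text of .bloom-local-rag/config.json; it throws a descriptive error on the first field that is missing or has the wrong type.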

Troubleshooting

API Key Issues

Error: Gemini API key not found

Solution: Set the appropriate environment variable:

  • For Gemini: export GOOGLE_API_KEY=your-key (or export GEMINI_API_KEY=your-key)
  • For OpenAI: export OPENAI_API_KEY=your-key
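
A wrapper script can check for a key before invoking the tool. This TypeScript sketch uses the variable names documented above. The order in which the two Gemini variables are checked is an assumption, since the tool's own precedence is not documented here, and findApiKeyVariable is a hypothetical helper.

```typescript
// Variable names come from the setup instructions in this README.
// The checking order for Gemini is an assumption, not documented behavior.
const KEY_VARIABLES = {
  gemini: ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
  openai: ["OPENAI_API_KEY"],
} as const;

type ProviderName = keyof typeof KEY_VARIABLES;

// Returns the name of the first variable that is set and non-empty,
// or null when no usable key is present for the given provider.
function findApiKeyVariable(
  provider: ProviderName,
  env: Record<string, string | undefined>,
): string | null {
  for (const name of KEY_VARIABLES[provider]) {
    const value = env[name];
    if (value !== undefined && value.trim() !== "") {
      return name;
    }
  }
  return null;
}

console.log(findApiKeyVariable("gemini", { GEMINI_API_KEY: "example" })); // GEMINI_API_KEY
console.log(findApiKeyVariable("openai", {})); // null
```

In a Node script, pass process.env as the second argument and exit with a clear message when the result is null.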

Directory Not Initialized

Error: Directory not initialized. Run "init" command first.

Solution: Run npx @masuidrive/bloom-local-rag --init in your project directory

No Results Found

  • Check if files match the configured extensions
  • Verify files aren't excluded by .gitignore
  • Try broader search terms
  • Increase the --limit parameter

Privacy & Security

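  • Local Storage: The index and vector database are stored on your machine in .bloom-local-rag
  • No Telemetry: We don't collect any usage data
  • API Calls: Text is sent to the configured provider for embedding, and your queries plus the retrieved chunks are sent to the LLM provider
  • Gitignore: Files matched by .gitignore are excluded from indexing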

Requirements

  • Node.js v20 or higher
  • API key for Google Gemini or OpenAI

License

MIT


Important Note: When modifying this README, please also update the Japanese translation in README.ja.md to maintain consistency across both versions.