@aiconnect/easy-rag
v0.3.1
A TypeScript CLI tool for local RAG indexing and querying
A TypeScript CLI for local RAG (Retrieval-Augmented Generation) — index documents and query them with natural language.
Uses OpenAI embeddings and a local ChromaDB instance for vector storage. No external infrastructure is needed beyond an OpenAI API key.
Quick Start
Prerequisites: Node.js >= 18 and an OpenAI API key.
# Install
npm install -g @aiconnect/easy-rag
# Set up your API key (interactive)
easy-rag init
# Index a folder and query it
easy-rag index ./my-docs
easy-rag query "What is the refund policy?"
Example output:
$ easy-rag index ./my-docs
Scanning ./my-docs...
Found 3 files (2 .md, 1 .pdf)
Chunking documents... 12 chunks created
Generating embeddings... done
Indexed 12 chunks into collection "my-docs"
$ easy-rag query "What is the refund policy?"
Result 1:
Refunds are available within 30 days of purchase.
Contact [email protected] to initiate a refund request.
Result 2:
All subscription plans include a 30-day money-back guarantee...
Commands
easy-rag init
Interactive setup — prompts for your OpenAI API key and embedding model. Saves to ~/.easy-rag/config.json.
You can also skip init and set an environment variable instead:
export OPENAI_API_KEY="sk-..."
easy-rag index <folder>
Scans a folder recursively and indexes all supported files into ChromaDB.
easy-rag index ./knowledge-base
- Supported formats: .pdf, .md, .csv
- Chunking: Automatic. Markdown by heading, PDF by paragraph, CSV by row
- Collection name: Derived from the folder name
- Re-indexing the same folder creates a new collection with a unique suffix
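To make the "Markdown by heading" rule concrete, here is a minimal sketch of heading-based chunking. It is not taken from the easy-rag source: the `Chunk` type and the `chunkMarkdownByHeading` function are hypothetical names, and the real chunker may treat nested headings, empty sections, or chunk size limits differently.

```typescript
// Hypothetical sketch: one chunk per Markdown heading section.
// `Chunk` and `chunkMarkdownByHeading` are illustrative names, not easy-rag's API.
interface Chunk {
  heading: string | null; // null for text that appears before the first heading
  text: string;
}

function chunkMarkdownByHeading(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let heading: string | null = null;
  let buffer: string[] = [];

  // Emit the buffered lines as a chunk, skipping sections with no body text.
  const flush = (): void => {
    const text = buffer.join("\n").trim();
    if (text.length > 0) chunks.push({ heading, text });
    buffer = [];
  };

  for (const line of markdown.split(/\r?\n/)) {
    const match = /^#{1,6}\s+(.*)$/.exec(line); // ATX headings: "# Title" to "###### Title"
    if (match) {
      flush();
      heading = (match[1] ?? "").trim();
    } else {
      buffer.push(line);
    }
  }
  flush();
  return chunks;
}

const sample = "Intro text\n# Refunds\nWithin 30 days.\n## Contact\nEmail support.";
const chunks = chunkMarkdownByHeading(sample);
console.log(chunks.map((c) => c.heading)); // [ null, 'Refunds', 'Contact' ]
```

This sketch does not skip `#` lines inside fenced code blocks, so a production chunker would need to track fence state as well.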
easy-rag query <question>
Search indexed documents using natural language.
easy-rag query "How do I configure the API?"
easy-rag query --top 3 --metadata "quarterly revenue"
easy-rag query --collection my-docs "setup instructions"
| Option | Description | Default |
|--------|-------------|---------|
| -t, --top <n> | Number of results | 5 |
| -m, --metadata | Show source file, score, chunk index | off |
| -c, --collection <name> | Search specific collection | all |
easy-rag collections
List all indexed collections with document counts.
easy-rag delete <collection>
Delete a collection. Use --force to skip confirmation.
easy-rag delete my-docs --force
easy-rag serve
Start ChromaDB in the foreground. Useful if you want a persistent server across multiple index/query calls instead of auto-starting one per command.
easy-rag parse <file>
Parse a single file and print extracted text. Useful for debugging.
How It Works
Documents (.pdf, .md, .csv)
     │
     ▼
┌─────────┐   ┌─────────┐   ┌─────────────┐   ┌──────────┐
│  Parse  │ → │  Chunk  │ → │    Embed    │ → │  Store   │
│         │   │         │   │  (OpenAI)   │   │(ChromaDB)│
└─────────┘   └─────────┘   └─────────────┘   └──────────┘
                                                    │
                                 Query ─── Embed ───┘
                                                    │
                                                    ▼
                                             Relevant chunks
- Parse: Extract text from PDF, Markdown, or CSV
- Chunk: Split by heading (Markdown), paragraph (PDF), or row (CSV)
- Embed: Generate vectors via OpenAI (text-embedding-3-large by default)
- Store: Save in ChromaDB (runs locally, auto-started)
- Query: Embed the question and find the closest chunks
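The query step can be pictured as a nearest-neighbour search over stored vectors. The sketch below is illustrative only: in easy-rag the search is delegated to ChromaDB, the README does not say which distance metric is used (cosine similarity is assumed here), and `StoredChunk`, `cosineSimilarity`, and `topK` are hypothetical names. Toy three-dimensional vectors stand in for real OpenAI embeddings.

```typescript
// Hypothetical sketch of "find the closest chunks", assuming cosine similarity.
interface StoredChunk {
  text: string;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as dissimilar to everything
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored chunk against the query vector and keep the k best.
function topK(queryVector: number[], store: StoredChunk[], k: number): { text: string; score: number }[] {
  return store
    .map((chunk) => ({ text: chunk.text, score: cosineSimilarity(queryVector, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const store: StoredChunk[] = [
  { text: "Refunds are available within 30 days.", vector: [1, 0, 0] },
  { text: "Shipping takes 5 business days.", vector: [0, 1, 0] },
  { text: "All plans include a money-back guarantee.", vector: [0.9, 0.1, 0] },
];

const hits = topK([1, 0, 0], store, 2);
console.log(hits.map((h) => h.text));
```

The `k` parameter plays the role of the `--top` option. A real vector store would use an index rather than scoring every chunk.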
Configuration
API Key
| Method | How | Priority |
|--------|-----|----------|
| easy-rag init | Interactive prompt, saved to ~/.easy-rag/config.json | 2nd |
| Environment variable | export OPENAI_API_KEY="sk-..." | 1st (overrides config) |
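The priority order in the table can be sketched as a small lookup function. This is an assumption-laden illustration, not easy-rag's code: `EasyRagConfig`, its `openaiApiKey` field, and `resolveApiKey` are hypothetical names, and the actual layout of ~/.easy-rag/config.json is not documented here.

```typescript
// Hypothetical sketch: the environment variable overrides the saved config.
interface EasyRagConfig {
  openaiApiKey?: string; // assumed field name; the real config.json layout may differ
}

function resolveApiKey(
  env: Record<string, string | undefined>,
  config: EasyRagConfig,
): string {
  // 1st priority: OPENAI_API_KEY from the environment. 2nd: the key saved by `easy-rag init`.
  const key = env["OPENAI_API_KEY"]?.trim() || config.openaiApiKey?.trim();
  if (!key) {
    throw new Error("No OpenAI API key found: run `easy-rag init` or set OPENAI_API_KEY");
  }
  return key;
}

const resolvedKey = resolveApiKey(
  { OPENAI_API_KEY: "sk-from-env" },
  { openaiApiKey: "sk-from-config" },
);
console.log(resolvedKey); // sk-from-env
```

In a Node.js program the first argument would typically be `process.env`.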
Embedding Model
Set during easy-rag init or via the EMBEDDING_MODEL environment variable.
| Model | Notes |
|-------|-------|
| text-embedding-3-large | Default, best performance |
| text-embedding-3-small | Faster, cheaper |
| text-embedding-ada-002 | Legacy |
ChromaDB
ChromaDB is the local vector store. It starts automatically — no setup needed.
- Data is stored in ~/.easy-rag/chromadb/ (auto-created on first use)
- To connect to an external instance:
export CHROMA_URL="http://my-chroma-host:8000"
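One way to picture the choice between the auto-started local server and an external instance is a function that inspects CHROMA_URL. This is a sketch under assumptions: `ChromaTarget` and `resolveChromaTarget` are hypothetical names, and how easy-rag actually validates the URL or starts the local server is not described in this README.

```typescript
// Hypothetical sketch: use CHROMA_URL when set, otherwise fall back to the local server.
type ChromaTarget =
  | { mode: "external"; url: string }
  | { mode: "local" }; // the tool would auto-start ChromaDB in this case

function resolveChromaTarget(env: Record<string, string | undefined>): ChromaTarget {
  const raw = env["CHROMA_URL"]?.trim();
  if (!raw) return { mode: "local" };

  const parsed = new URL(raw); // throws a TypeError on a malformed URL
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`CHROMA_URL must use http or https, got ${parsed.protocol}`);
  }
  return { mode: "external", url: parsed.origin };
}

console.log(resolveChromaTarget({ CHROMA_URL: "http://my-chroma-host:8000" }));
console.log(resolveChromaTarget({}));
```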
Development
git clone https://github.com/johnjohn-aic/easy-rag.git
cd easy-rag
npm install
npm run build # Compile TypeScript
npm test # Run tests
npx tsx src/index.ts <command> # Dev mode (no build needed)
Project Structure
src/
  index.ts        # CLI entrypoint (commander)
  parsers/        # PDF, Markdown, CSV text extraction
  chunker/        # Content-aware text splitting
  embeddings/     # OpenAI embedding generation
  vector-store/   # ChromaDB client + server management
  query/          # Vector similarity search
  indexer/        # Orchestrates parse → chunk → embed → store
  config/         # Global config (~/.easy-rag/config.json)
  commands/       # CLI command implementations (init, serve)
License
MIT
