# rag-engine
Agentic RAG framework for Node.js — zero runtime dependencies, auto-retries with query rewriting, full decision trace.
Not basic RAG. Agentic RAG. The agent decides when to search, when to rewrite the query, when to retry, and when to give up honestly.
## 5-Line Quickstart
```js
import { RagEngine } from 'rag-engine'
const rag = await RagEngine.create()
await rag.ingest('./docs')
const result = await rag.query('How does auth work?')
console.log(result.answer)
```

## Why Not LangChain?
| | rag-engine | LangChain |
|---|---|---|
| Dependencies | 0 runtime deps | 200+ |
| Bundle size | ~50KB | ~5MB |
| Agent retries | Built-in | Manual |
| Decision trace | Every query | No |
| Setup | 5 lines | 50+ lines |
## How the Agent Thinks
After every retrieval, an LLM judge evaluates if the chunks answer the question:
| Decision | When | What happens |
|----------|------|--------------|
| SYNTHESIZE | Relevance >= 0.7 | Chunks are good, generate answer |
| REWRITE | Relevance 0.3-0.7 | Chunks are off-topic, rewrite query and retry |
| BROADEN | < 3 results | Too few results, broaden query |
| GIVE_UP | Max retries or relevance < 0.3 | Honestly say "I don't know" |
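In code, this decision step boils down to a handful of comparisons. Here is a minimal sketch, assuming the default thresholds from the table above; `decide` and its argument names are illustrative, not the library's internals:

```js
// Illustrative decision step, not the library's internals. The
// thresholds mirror the table above and the default agent config.
function decide({ relevance, resultsCount, retries }, { maxRetries = 3, relevanceThreshold = 0.7 } = {}) {
  if (resultsCount < 3) return 'BROADEN'                    // too few results to judge
  if (relevance >= relevanceThreshold) return 'SYNTHESIZE'  // chunks are good
  if (relevance < 0.3 || retries >= maxRetries) return 'GIVE_UP'
  return 'REWRITE'                                          // off-topic: rephrase and retry
}
```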
## Install
```bash
npm install rag-engine
```

## Usage
### Basic
```js
import { RagEngine } from 'rag-engine'

const rag = await RagEngine.create()  // auto-detects OPENAI_API_KEY
await rag.ingest('./docs')            // loads, chunks, embeds

const result = await rag.query('How does auth work?')
console.log(result.answer)   // answer with citations
console.log(result.sources)  // relevant chunks with scores
console.log(result.trace)    // full agent decision trace
console.log(result.metrics)  // timing, LLM calls
```

### Custom Config
```js
const rag = await RagEngine.create({
  llm: {
    provider: 'openai',
    model: 'gpt-4o',
    temperature: 0.1,
  },
  embeddings: {
    provider: 'openai',
    model: 'text-embedding-3-small',
  },
  agent: {
    maxRetries: 3,
    relevanceThreshold: 0.7,
  },
  chunker: {
    maxTokens: 512,
    overlap: 50,
  },
  retrieval: {
    topK: 10,
  },
})
```
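The `chunker` options describe a sliding window: each chunk holds up to `maxTokens` tokens, and consecutive chunks share `overlap` tokens, so text split at a chunk boundary still appears intact in a neighbor. A rough sketch of the idea, splitting on whitespace for brevity rather than counting real model tokens:

```js
// Illustrative sliding-window chunking, not the library's chunker.
// Uses whitespace "tokens" for brevity; step < maxTokens creates overlap.
function slidingWindowChunks(text, maxTokens = 512, overlap = 50) {
  const tokens = text.split(/\s+/)
  const step = maxTokens - overlap
  const chunks = []
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens).join(' '))
    if (start + maxTokens >= tokens.length) break  // last window reached the end
  }
  return chunks
}
```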
### Ingest Files

```js
await rag.ingest('./docs')                      // all text files
await rag.ingest('./src', { glob: '**/*.ts' })  // TypeScript only
await rag.ingest('./README.md')                 // single file
await rag.ingest('Raw text content to index')   // raw string
```
### Query Response

```js
const result = await rag.query('What is the refund policy?')
// result.answer: "The refund policy allows returns within 30 days..."
// result.sources: [{ id: "policy.md:3", content: "...", score: 0.92, metadata: {...} }]
// result.trace: [
//   { action: "search", query: "What is the refund policy?", resultsCount: 5 },
//   { action: "evaluate", score: 0.89, decision: "synthesize" },
//   { action: "synthesize" }
// ]
// result.metrics: { totalTimeMs: 2340, retrievalTimeMs: 180, llmCalls: 2 }
```
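Because every decision is recorded, the trace doubles as a debug log. A small illustrative helper, using only the fields shown above, makes a query's path readable at a glance:

```js
// Walk the trace from the example above. The field names
// (action, query, decision, metrics) come from that response shape.
function printTrace(result) {
  for (const step of result.trace) {
    console.log(`${step.action}: ${step.query ?? step.decision ?? ''}`)
  }
  console.log(`(${result.metrics.llmCalls} LLM calls in ${result.metrics.totalTimeMs}ms)`)
}
```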
### Express.js API

```js
import express from 'express'
import { RagEngine } from 'rag-engine'

const app = express()
const rag = await RagEngine.create()
await rag.ingest('./docs')

app.use(express.json())
app.post('/ask', async (req, res) => {
  const result = await rag.query(req.body.question)
  res.json(result)
})

app.listen(3000)
```
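Any HTTP client can then hit the endpoint; for example, with Node's built-in `fetch` (the port matches the example above):

```js
// Call the /ask endpoint started above; answer and trace come
// from the documented query response shape.
const res = await fetch('http://localhost:3000/ask', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ question: 'How does auth work?' }),
})
const { answer, trace } = await res.json()
console.log(answer, trace)
```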
## CLI

```bash
npx rag-engine ingest ./docs
npx rag-engine ingest ./src --glob "**/*.ts"
npx rag-engine query "How does authentication work?"
npx rag-engine stats
```

## Environment
```bash
OPENAI_API_KEY=sk-...   # Required for OpenAI LLM + embeddings
```
## Architecture

```
src/
  core/engine.ts      RagEngine class — wires everything together
  core/agent.ts       Agentic loop (retrieve → judge → decide → retry/answer)
  llm/openai.ts       OpenAI LLM + embeddings via native fetch()
  llm/prompts.ts      All agent prompts (judge, synthesizer)
  stores/memory.ts    In-memory vector store (Map + cosine similarity)
  ingest/loader.ts    File/directory loader
  ingest/chunkers/    Sliding-window chunker
```
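The in-memory store is the simplest part of the stack: embeddings live in a `Map`, and search is a linear scan ranked by cosine similarity. A sketch of that idea, illustrative rather than the actual `stores/memory.ts`:

```js
// Illustrative Map + cosine-similarity store, as described above.
class MemoryStore {
  constructor() { this.vectors = new Map() }  // id -> { embedding, chunk }
  add(id, embedding, chunk) { this.vectors.set(id, { embedding, chunk }) }
  search(queryEmbedding, topK = 10) {
    return [...this.vectors.entries()]
      .map(([id, { embedding, chunk }]) => ({ id, chunk, score: cosine(queryEmbedding, embedding) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK)
  }
}

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    na += a[i] * a[i]
    nb += b[i] * b[i]
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb))
}
```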
## Roadmap

- [ ] Ollama provider (free local RAG)
- [ ] Anthropic + Gemini providers
- [ ] Streaming responses
- [ ] SQLite vector store
- [ ] Markdown + code-aware chunkers
- [ ] Hybrid retrieval (vector + BM25)
- [ ] Plugin system
- [ ] Built-in evaluation
- [ ] `npx rag-engine serve` (HTTP API)
## License
MIT
