next-rag
v1.0.2
Production-ready RAG (Retrieval-Augmented Generation) framework for Next.js applications with PostgreSQL + pgvector
next-rag (SDK + Sample App)
Production-ready Retrieval-Augmented Generation (RAG) SDK for Next.js with PostgreSQL + pgvector, OpenAI embeddings, hybrid retrieval (vector + sparse), RRF/MMR fusion, optional LLM query rewrite, and zero‑code drop‑in API routes.
1) Install
npm install next-rag
Env vars (for local dev and production):
- DATABASE_URL (PostgreSQL)
- OPENAI_API_KEY
- Optional: OPENAI_MODEL (default text-embedding-3-small)
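A missing variable is easier to diagnose at startup than at the first query. The following is a minimal sketch of an env check, assuming you want to fail fast; `readRagEnv` and `RagEnv` are hypothetical names for illustration and are not part of next-rag.

```typescript
// Hypothetical helper (not part of next-rag): validates the env vars listed above.
interface RagEnv {
  databaseUrl: string;
  openaiApiKey: string;
  embeddingModel: string;
}

// Default taken from the list above.
const DEFAULT_EMBEDDING_MODEL = 'text-embedding-3-small';

function readRagEnv(env: Record<string, string | undefined>): RagEnv {
  // Collect every missing required variable so the error names all of them at once.
  const missing = ['DATABASE_URL', 'OPENAI_API_KEY'].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(', ')}`);
  }
  return {
    databaseUrl: env.DATABASE_URL as string,
    openaiApiKey: env.OPENAI_API_KEY as string,
    // OPENAI_MODEL is optional; fall back to the documented default.
    embeddingModel: env.OPENAI_MODEL || DEFAULT_EMBEDDING_MODEL,
  };
}
```

In a Next.js app this would typically be called once with `process.env` before constructing `RAGCore`.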
2) Database prerequisites
Once per database (all three extensions required):
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
RAGCore.initialize() runs migrations and sets vector dimensions based on model (or vectorDimensions).
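The vector column width has to match the embedding model's output size, which is why the dimension is derived from the model. The sketch below illustrates that kind of lookup. It is not the SDK's code: `resolveVectorDimensions` is a hypothetical name, and giving an explicit `vectorDimensions` precedence over the model default is an assumption. The dimension values are the default output sizes OpenAI publishes for these models.

```typescript
// Default output sizes of OpenAI embedding models.
const KNOWN_EMBEDDING_DIMENSIONS: Record<string, number> = {
  'text-embedding-3-small': 1536,
  'text-embedding-3-large': 3072,
  'text-embedding-ada-002': 1536,
};

// Hypothetical resolver: an explicit override wins (assumption), else use the model default.
function resolveVectorDimensions(model: string, vectorDimensions?: number): number {
  if (vectorDimensions !== undefined) {
    if (!Number.isInteger(vectorDimensions) || vectorDimensions <= 0) {
      throw new Error('vectorDimensions must be a positive integer');
    }
    return vectorDimensions;
  }
  const dims = KNOWN_EMBEDDING_DIMENSIONS[model];
  if (dims === undefined) {
    throw new Error(`Unknown model "${model}"; pass vectorDimensions explicitly`);
  }
  return dims;
}
```

Switching to a model with a different output size after ingestion generally means re-embedding existing content, since stored vectors and the column width no longer match.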
3) Quick start (SDK)
import { RAGCore } from 'next-rag/core';

const rag = new RAGCore({
  database: { connectionString: process.env.DATABASE_URL! },
  embedding: { apiKey: process.env.OPENAI_API_KEY!, model: process.env.OPENAI_MODEL },
  chunking: { chunkSize: 800, overlapSize: 120 },
  search: { defaultSimilarityThreshold: 0.8 },
  pipeline: { enableMonitoring: true },
});
await rag.initialize();

// Ingest
await rag.ingest({
  text: '# Title\nContent...',
  metadata: { title: 'Doc', source: 'https://example' },
});

// Search
const basic = await rag.search('next.js', { limit: 10 });

// Advanced search (three-stage)
const adv = await rag.searchAdvanced('next.js', {
  limit: 10,
  vectorLimit: 80,
  sparseLimit: 100,
  rrfK: 60,
  rerankLimit: 80,
  mmrLambda: 0.7,
  // Sparse tuning (forwarded)
  sparseLanguage: 'simple',
  sparseTrigramFallback: true,
  sparseTrigramThreshold: 0.04,
  // Quality gating (forwarded)
  minCoverage: 0.4,
  minFusedScore: 0.1,
  // Reranker (forwarded)
  reranker: { enabled: true, topK: 80 },
  // Query rewrite (LLM)
  rewrite: { enabled: true, model: 'gpt-4o-mini', strategy: 'clarify', timeoutMs: 3500 },
  // Debug surfacing
  debug: true,
});
3.1) Golden Path (copy/paste)
Follow these exact steps in a fresh Next.js app to get a working ingestion + advanced search in minutes.
- Enable required Postgres extensions (one-time):
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
- Install the SDK in your Next.js project:
npm install next-rag
- Generate zero‑code API routes:
npx next-rag init
- Create .env.local:
DATABASE_URL=postgres://user:pass@host:5432/db
OPENAI_API_KEY=sk-...
OPENAI_MODEL=text-embedding-3-small
- Start your Next.js app:
npm run dev
- Ingest a document (multipart file):
curl -sS -X POST http://localhost:3000/api/documents/ingest \
  -F 'file=@./README.md' | jq
- Run an advanced search with rewrite + debug telemetry:
curl -sS -X POST http://localhost:3000/api/search/advanced \
  -H 'Content-Type: application/json' \
  -d '{
    "query":"next.js",
    "rewrite": {"enabled": true, "model":"gpt-4o-mini", "strategy":"clarify", "timeoutMs": 3500},
    "debug": true,
    "minCoverage": 0.4,
    "minFusedScore": 0.1
  }' | jq
Notes:
- On managed Postgres, you may need to enable extensions (vector/pg_trgm/pgcrypto) via provider console.
- If rewrite times out in your environment, temporarily set timeoutMs: 0 to confirm functionality, then tune (2.5–4s typical).
4) Zero‑code drop‑in routes
Generate Next.js routes automatically:
npx next-rag init
This writes:
- app/api/search/advanced/route.ts
- app/api/documents/ingest/route.ts
Or import prebuilt handlers:
// app/api/search/advanced/route.ts
import { createAdvancedSearchHandler } from 'next-rag/nextjs';
export const POST = createAdvancedSearchHandler();
// app/api/documents/ingest/route.ts
import { createIngestHandler } from 'next-rag/nextjs';
export const POST = createIngestHandler();
5) REST examples (curl)
Ingest JSON:
curl -X POST http://localhost:3001/api/documents/ingest \
  -H 'Content-Type: application/json' \
  -d '{"text":"# Title\nBody...","metadata":{"title":"Doc","source":"https://example"}}'
Ingest file (multipart):
curl -X POST http://localhost:3001/api/documents/ingest \
  -F 'file=@./README.md'
Advanced search with rewrite and debug:
curl -sS -X POST http://localhost:3001/api/search/advanced \
  -H 'Content-Type: application/json' \
  -d '{
    "query":"next.js",
    "rewrite": {"enabled": true, "model":"gpt-4o-mini", "strategy":"clarify", "timeoutMs": 3000},
    "debug": true,
    "minCoverage": 0.4,
    "minFusedScore": 0.1
  }'
6) Citations
Each result includes contentId and metadata. Set metadata.source at ingestion for stable citations (URL/filename). Use metadata.title, metadata.headingPath when available.
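As an illustration of turning those fields into a display string, here is a minimal sketch. The `CitedResult` shape is an assumption built only from the fields named above; in particular, treating `headingPath` as an array of heading strings is a guess. `formatCitation` is a hypothetical helper, not an SDK export.

```typescript
// Assumed result shape: only the fields named above; headingPath as string[] is a guess.
interface CitedResult {
  contentId: string;
  metadata: { source?: string; title?: string; headingPath?: string[] };
}

// Hypothetical helper: prefers title, then source, then contentId as the label.
function formatCitation(result: CitedResult): string {
  const { source, title, headingPath } = result.metadata;
  const label = title ?? source ?? result.contentId;
  // Append the heading trail when present, e.g. "Doc > Title > Intro".
  const section =
    headingPath && headingPath.length > 0 ? ` > ${headingPath.join(' > ')}` : '';
  // Show the source in parentheses unless it is already used as the label.
  const origin = source && source !== label ? ` (${source})` : '';
  return `${label}${section}${origin}`;
}

const example = formatCitation({
  contentId: 'c1',
  metadata: { title: 'Doc', source: 'https://example', headingPath: ['Title', 'Intro'] },
});
// example === 'Doc > Title > Intro (https://example)'
```

Falling back to `contentId` keeps the citation non-empty for documents ingested without metadata, though such citations are not meaningful to end users.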
7) Telemetry & debug
With debug: true, advanced responses can include:
- rewrittenQuery, rewriteVariants, effectiveQuery
- Vector/sparse telemetry (via example route) for analysis
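When comparing vector and sparse telemetry, it can help to recall how reciprocal rank fusion (RRF) combines two ranked lists. The sketch below is the textbook formulation, in which each list contributes 1 / (k + rank) per item. It is offered for intuition only and may differ from what the SDK does internally. `reciprocalRankFusion` and `fuseRankings` are hypothetical names, and the default `k = 60` simply mirrors the `rrfK: 60` shown in the quick start.

```typescript
// Textbook RRF: score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
function reciprocalRankFusion(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// Returns ids ordered by fused score, highest first.
function fuseRankings(rankings: string[][], k = 60): string[] {
  return Array.from(reciprocalRankFusion(rankings, k).entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Illustrative ids only: 'b' ranks high in both lists, so it ends up first.
const vectorHits = ['a', 'b', 'c'];
const sparseHits = ['b', 'd', 'a'];
const fused = fuseRankings([vectorHits, sparseHits]);
// fused is ['b', 'a', 'd', 'c']
```

Items that appear in both lists accumulate two contributions, which is why a document ranked moderately well by both retrievers can outrank one that only a single retriever found.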
8) Performance tips
- Create/maintain ANN indexes (IVFFlat/HNSW) and run ANALYZE after ingestion.
- Keep vectorLimit/sparseLimit tight for latency.
- Tune sparseLanguage='simple' and sparseTrigramThreshold≈0.03–0.06 for tech content.
- Use reranker only when needed; tune topK.
- Query rewrite: 2–4s timeout is typical; fall back if it times out.
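For the last tip, one generic way to fall back on timeout is to race the rewrite call against a timer and return the original query if the timer wins. This is a sketch of that pattern and not the SDK's internal behavior. `withTimeout` is a hypothetical helper, and treating `timeoutMs <= 0` as "no timeout" is an assumption inferred from the `timeoutMs: 0` note earlier.

```typescript
// Hypothetical helper: resolves with `fallback` if `task` takes longer than timeoutMs.
async function withTimeout<T>(task: Promise<T>, timeoutMs: number, fallback: T): Promise<T> {
  // Assumption: a non-positive timeout means "wait indefinitely".
  if (timeoutMs <= 0) return task;

  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  try {
    // Whichever settles first wins; a slow task is ignored, not cancelled.
    return await Promise.race([task, timeout]);
  } finally {
    // Clear the timer so it does not keep the process alive after a fast task.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A call might look like `withTimeout(rewriteQuery(q), 3000, q)`, where `rewriteQuery` stands in for whatever performs the LLM rewrite. The race does not abort the underlying request; cancelling it would need an `AbortController` wired into the HTTP call.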
9) Troubleshooting
- Rewrite aborted: raise timeoutMs or set timeoutMs: 0 to test; confirm OPENAI_API_KEY.
- Ingest 500 on file: ensure multipart/form-data (not JSON) for file uploads.
- Low recall on dotted terms: trigram fallback + rewrite/variants.
- Confusing scores: calibrate fused score for display (e.g., tanh to 0–100).
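The tanh calibration mentioned in the last item could look like the sketch below. `calibrateScore` and its `scale` parameter are hypothetical: `scale` is a tuning knob you would pick by looking at the fused scores your own data produces, and the result is a display value only, not a probability.

```typescript
// Hypothetical display helper: squashes a non-negative fused score into 0–100 via tanh.
function calibrateScore(fusedScore: number, scale: number): number {
  if (!(scale > 0)) {
    throw new Error('scale must be positive');
  }
  // Negative scores are clamped to 0 so the output stays within 0–100.
  const clamped = Math.max(0, fusedScore);
  // tanh is monotonic, so result ordering is preserved; a score equal to `scale` maps to 76.
  return Math.round(100 * Math.tanh(clamped / scale));
}
```

Because tanh saturates, scores several times larger than `scale` all display as values near 100, so choose `scale` close to what a typical good match scores in your corpus.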
10) Sample app
See examples/sample. The README there shows env setup, run commands, curl examples, and how the advanced route forwards rewrite/debug to RAGCore.searchAdvanced.
11) Scripts
- npm run build (SDK build)
- npm run test, npm run lint, npm run typecheck
License
MIT
