next-rag

v1.0.2

Published

4 months ago

Production-ready RAG (Retrieval-Augmented Generation) framework for Next.js applications with PostgreSQL + pgvector


crazyrabbitltc

nextjs rag retrieval-augmented-generation embeddings semantic-search vector-database postgresql pgvector vercel

next-rag (SDK + Sample App)

Production-ready Retrieval-Augmented Generation (RAG) SDK for Next.js with PostgreSQL + pgvector, OpenAI embeddings, hybrid retrieval (vector + sparse), RRF/MMR fusion, optional LLM query rewrite, and zero‑code drop‑in API routes.

1) Install

npm install next-rag

Env vars (for local dev and production):

DATABASE_URL (PostgreSQL)
OPENAI_API_KEY
Optional: OPENAI_MODEL (default text-embedding-3-small)
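
A minimal sketch of validating these variables at startup, so a missing value fails fast rather than surfacing later as a connection or authentication error. The helper name loadRagEnv and the RagEnv shape are hypothetical and not part of next-rag; only the variable names and the default model come from the list above.

```typescript
// Hypothetical helper (not part of next-rag): check env vars before constructing RAGCore.
interface RagEnv {
  databaseUrl: string;
  openaiApiKey: string;
  embeddingModel: string;
}

function loadRagEnv(env: Record<string, string | undefined>): RagEnv {
  const required = ['DATABASE_URL', 'OPENAI_API_KEY'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(', ')}`);
  }
  return {
    databaseUrl: env['DATABASE_URL'] as string,
    openaiApiKey: env['OPENAI_API_KEY'] as string,
    // OPENAI_MODEL is optional; fall back to the default named in this README.
    embeddingModel: env['OPENAI_MODEL'] || 'text-embedding-3-small',
  };
}
```

In an app this would typically be called once as loadRagEnv(process.env), and the result passed into the RAGCore config.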

2) Database prerequisites

Once per database (all three extensions required):

CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS pgcrypto;

RAGCore.initialize() runs migrations and sets the vector dimensions from the embedding model (or from vectorDimensions, if set).
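
To illustrate the "model or vectorDimensions" rule, here is a rough sketch of how dimensions could be resolved. The SDK's actual logic is not shown in this README, so resolveVectorDimensions and KNOWN_DIMENSIONS are hypothetical names; the dimension values are the defaults OpenAI publishes for these models.

```typescript
// Default output dimensions published by OpenAI for its embedding models.
const KNOWN_DIMENSIONS: Record<string, number> = {
  'text-embedding-3-small': 1536,
  'text-embedding-3-large': 3072,
  'text-embedding-ada-002': 1536,
};

// Hypothetical helper: an explicit override wins, otherwise use the model default.
function resolveVectorDimensions(model: string, vectorDimensions?: number): number {
  if (vectorDimensions !== undefined) {
    if (!Number.isInteger(vectorDimensions) || vectorDimensions <= 0) {
      throw new Error('vectorDimensions must be a positive integer');
    }
    return vectorDimensions;
  }
  const dims = KNOWN_DIMENSIONS[model];
  if (dims === undefined) {
    throw new Error(`Unknown model "${model}"; set vectorDimensions explicitly`);
  }
  return dims;
}
```

Because a pgvector column declared with a fixed dimension only accepts vectors of that length, switching to a model with a different output size on an existing database will likely require re-embedding the stored content.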

3) Quick start (SDK)

import { RAGCore } from 'next-rag/core';

const rag = new RAGCore({
  database: { connectionString: process.env.DATABASE_URL! },
  embedding: { apiKey: process.env.OPENAI_API_KEY!, model: process.env.OPENAI_MODEL },
  chunking: { chunkSize: 800, overlapSize: 120 },
  search: { defaultSimilarityThreshold: 0.8 },
  pipeline: { enableMonitoring: true },
});
await rag.initialize();

// Ingest
await rag.ingest({
  text: '# Title\nContent...',
  metadata: { title: 'Doc', source: 'https://example' },
});

// Search
const basic = await rag.search('next.js', { limit: 10 });

// Advanced search (three-stage)
const adv = await rag.searchAdvanced('next.js', {
  limit: 10,
  vectorLimit: 80,
  sparseLimit: 100,
  rrfK: 60,
  rerankLimit: 80,
  mmrLambda: 0.7,
  // Sparse tuning (forwarded)
  sparseLanguage: 'simple',
  sparseTrigramFallback: true,
  sparseTrigramThreshold: 0.04,
  // Quality gating (forwarded)
  minCoverage: 0.4,
  minFusedScore: 0.1,
  // Reranker (forwarded)
  reranker: { enabled: true, topK: 80 },
  // Query rewrite (LLM)
  rewrite: { enabled: true, model: 'gpt-4o-mini', strategy: 'clarify', timeoutMs: 3500 },
  // Debug surfacing
  debug: true,
});
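
The rrfK and mmrLambda options above correspond to two standard techniques: Reciprocal Rank Fusion (RRF) and Maximal Marginal Relevance (MMR). The sketch below shows the textbook form of each so the parameters are easier to reason about. It is not next-rag's internal code; the function names and the Candidate shape are assumptions made for illustration.

```typescript
// RRF: score(d) = sum over ranked lists of 1 / (k + rank(d)), with rank starting at 1.
function reciprocalRankFusion(rankedLists: string[][], k = 60): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

interface Candidate {
  id: string;
  relevance: number;   // e.g. a fused or reranker score
  embedding: number[]; // used only to measure redundancy
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// MMR: greedily pick the candidate maximizing
// lambda * relevance - (1 - lambda) * (max similarity to anything already picked).
function mmrSelect(candidates: Candidate[], limit: number, lambda = 0.7): Candidate[] {
  const remaining = candidates.slice();
  const selected: Candidate[] = [];
  while (selected.length < limit && remaining.length > 0) {
    let bestIndex = 0;
    let bestScore = -Infinity;
    remaining.forEach((c, i) => {
      // The penalty is 0 while nothing has been selected yet.
      const maxSim = selected.reduce((m, s) => Math.max(m, cosine(c.embedding, s.embedding)), 0);
      const score = lambda * c.relevance - (1 - lambda) * maxSim;
      if (score > bestScore) {
        bestScore = score;
        bestIndex = i;
      }
    });
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}

// "b" and "a" appear in both lists, so they outrank the single-list hits: order is b, a, d, c.
const fused = reciprocalRankFusion([['a', 'b', 'c'], ['b', 'd', 'a']], 60);

// Picks x, then z: y duplicates x, so its redundancy penalty outweighs its higher relevance.
const diverse = mmrSelect(
  [
    { id: 'x', relevance: 1.0, embedding: [1, 0] },
    { id: 'y', relevance: 0.9, embedding: [1, 0] },
    { id: 'z', relevance: 0.6, embedding: [0, 1] },
  ],
  2,
  0.7,
);
```

A larger rrfK flattens the difference between top and lower ranks; a mmrLambda closer to 1 favors relevance, while a value closer to 0 favors diversity.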

3.1) Golden Path (copy/paste)

Follow these exact steps in a fresh Next.js app to get ingestion and advanced search working in minutes.

Enable required Postgres extensions (one-time):

CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE EXTENSION IF NOT EXISTS pgcrypto;

Install the SDK in your Next.js project:

npm install next-rag

Generate zero‑code API routes:

npx next-rag init

Create .env.local:

DATABASE_URL=postgres://user:pass@host:5432/db
OPENAI_API_KEY=sk-...
OPENAI_MODEL=text-embedding-3-small

Start your Next.js app:

npm run dev

Ingest a document (multipart file):

curl -sS -X POST http://localhost:3000/api/documents/ingest \
  -F 'file=@./README.md' | jq

Run an advanced search with rewrite + debug telemetry:

curl -sS -X POST http://localhost:3000/api/search/advanced \
  -H 'Content-Type: application/json' \
  -d '{
    "query":"next.js",
    "rewrite": {"enabled": true, "model":"gpt-4o-mini", "strategy":"clarify", "timeoutMs": 3500},
    "debug": true,
    "minCoverage": 0.4,
    "minFusedScore": 0.1
  }' | jq

Notes:

On managed Postgres, you may need to enable the extensions (vector, pg_trgm, pgcrypto) through your provider's console.
If rewrite times out in your environment, temporarily set timeoutMs: 0 to confirm that rewrite works, then tune the timeout (2.5–4s is typical).

4) Zero‑code drop‑in routes

Generate Next.js routes automatically:

npx next-rag init

This writes:

app/api/search/advanced/route.ts
app/api/documents/ingest/route.ts

Or import prebuilt handlers:

// app/api/search/advanced/route.ts
import { createAdvancedSearchHandler } from 'next-rag/nextjs';
export const POST = createAdvancedSearchHandler();

// app/api/documents/ingest/route.ts
import { createIngestHandler } from 'next-rag/nextjs';
export const POST = createIngestHandler();

5) REST examples (curl)

Ingest JSON (the examples in this section target port 3001; adjust the port to match your dev server):

curl -X POST http://localhost:3001/api/documents/ingest \
  -H 'Content-Type: application/json' \
  -d '{"text":"# Title\nBody...","metadata":{"title":"Doc","source":"https://example"}}'

Ingest file (multipart):

curl -X POST http://localhost:3001/api/documents/ingest \
  -F 'file=@./README.md'

Advanced search with rewrite and debug:

curl -sS -X POST http://localhost:3001/api/search/advanced \
  -H 'Content-Type: application/json' \
  -d '{
    "query":"next.js",
    "rewrite": {"enabled": true, "model":"gpt-4o-mini", "strategy":"clarify", "timeoutMs": 3000},
    "debug": true,
    "minCoverage": 0.4,
    "minFusedScore": 0.1
  }'
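
The same advanced search call can be made from TypeScript. This is a rough sketch: the body fields mirror the curl example, but the full request schema and the response shape are not documented here, so the response is left as unknown. The names AdvancedSearchBody, FetchLike, buildAdvancedSearchRequest, and advancedSearch are hypothetical.

```typescript
// Request body mirroring the curl example above (assumed, not an official schema).
interface AdvancedSearchBody {
  query: string;
  rewrite?: { enabled: boolean; model?: string; strategy?: string; timeoutMs?: number };
  debug?: boolean;
  minCoverage?: number;
  minFusedScore?: number;
}

interface JsonRequest {
  method: 'POST';
  headers: { 'Content-Type': string };
  body: string;
}

// Minimal structural type so the sketch accepts the global fetch or a test double.
type FetchLike = (
  url: string,
  init: JsonRequest,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function buildAdvancedSearchRequest(
  baseUrl: string,
  body: AdvancedSearchBody,
): { url: string; init: JsonRequest } {
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, '')}/api/search/advanced`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

async function advancedSearch(
  fetchImpl: FetchLike,
  baseUrl: string,
  body: AdvancedSearchBody,
): Promise<unknown> {
  const { url, init } = buildAdvancedSearchRequest(baseUrl, body);
  const res = await fetchImpl(url, init);
  if (!res.ok) {
    throw new Error(`Advanced search failed: HTTP ${res.status}`);
  }
  return res.json();
}
```

In a runtime with a global fetch, a call would look like advancedSearch(fetch, 'http://localhost:3001', { query: 'next.js', debug: true }).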

6) Citations

Each result includes contentId and metadata. Set metadata.source at ingestion for stable citations (URL/filename). Use metadata.title and metadata.headingPath when available.
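
A small sketch of turning a result into a display citation. Only contentId and the metadata field names come from this README; the SearchResultLike shape, the assumption that headingPath is an array of strings, and the formatCitation helper are all hypothetical.

```typescript
// Assumed metadata shape; headingPath as string[] is a guess, not a documented type.
interface CitationMetadata {
  source?: string;
  title?: string;
  headingPath?: string[];
}

interface SearchResultLike {
  contentId: string;
  metadata?: CitationMetadata;
}

// Prefer title, then source, then contentId as the label; append the heading path and source if present.
function formatCitation(result: SearchResultLike, index: number): string {
  const meta: CitationMetadata = result.metadata ?? {};
  const label = meta.title ?? meta.source ?? result.contentId;
  const section =
    meta.headingPath && meta.headingPath.length > 0 ? ` > ${meta.headingPath.join(' > ')}` : '';
  const source = meta.source && meta.source !== label ? ` (${meta.source})` : '';
  return `[${index + 1}] ${label}${section}${source}`;
}
```

Falling back to contentId keeps the citation stable even for documents ingested without a source or title.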

7) Telemetry & debug

With debug: true, advanced responses can include:

rewrittenQuery, rewriteVariants, effectiveQuery
Vector/sparse telemetry (via example route) for analysis

8) Performance tips

Create/maintain ANN indexes (IVFFlat/HNSW) and run ANALYZE after ingestion.
Keep vectorLimit/sparseLimit tight for latency.
For technical content, use sparseLanguage: 'simple' and tune sparseTrigramThreshold in the 0.03–0.06 range.
Use reranker only when needed; tune topK.
Query rewrite: 2–4s timeout is typical; fall back if it times out.
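
The "fall back if it times out" advice can be sketched as a small wrapper that returns the original query when a rewrite call is slow or fails. This is an application-level illustration, not the SDK's internal handling; withTimeoutFallback and effectiveQuery are hypothetical names, and the rewrite function is supplied by the caller.

```typescript
// Hypothetical helper: resolve with `fallback` if `work` rejects or does not settle within timeoutMs.
// Note: unlike the timeoutMs: 0 tip in this README, a 0 here means "fall back almost immediately".
function withTimeoutFallback<T>(work: Promise<T>, timeoutMs: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      () => {
        // Treat a failed rewrite the same as a slow one.
        clearTimeout(timer);
        resolve(fallback);
      },
    );
  });
}

// Keep the user's original query when the caller-supplied rewrite is too slow.
async function effectiveQuery(
  original: string,
  rewrite: (q: string) => Promise<string>,
  timeoutMs = 3000,
): Promise<string> {
  return withTimeoutFallback(rewrite(original), timeoutMs, original);
}
```

The slow rewrite call is not cancelled by this wrapper; its result is simply ignored once the timer fires.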

9) Troubleshooting

Rewrite aborted: raise timeoutMs, or set timeoutMs: 0 to test; confirm that OPENAI_API_KEY is set.
Ingest returns 500 on file upload: send the file as multipart/form-data (not JSON).
Low recall on dotted terms (e.g., next.js): enable trigram fallback (sparseTrigramFallback) and query rewrite/variants.
Confusing scores: calibrate the fused score for display (e.g., map it through tanh onto 0–100).
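
As an illustration of the tanh calibration: if the fused score behaves like plain RRF, two lists with rrfK = 60 give at most 2/61 ≈ 0.033, which looks alarmingly low when shown raw. The sketch below maps a score onto 0–100 for display only. The helper name toDisplayScore and the default scale of 30 are assumptions to tune against your own score distribution, and whether next-rag's fused score is plain RRF is not confirmed here.

```typescript
// Map a fused score onto 0–100 for display. `scale` is an assumed tuning knob:
// larger values push typical scores closer to 100.
function toDisplayScore(fusedScore: number, scale = 30): number {
  const clamped = Math.max(0, fusedScore); // negative scores display as 0
  return Math.round(100 * Math.tanh(clamped * scale));
}
```

The mapping is monotonic, so it changes only how scores are presented, not the result order. Keep thresholds such as minFusedScore on the raw scale.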

10) Sample app

See examples/sample. The README there shows env setup, run commands, curl examples, and how the advanced route forwards rewrite/debug to RAGCore.searchAdvanced.

11) Scripts

npm run build (SDK build)
npm run test, npm run lint, npm run typecheck

License

MIT