rag-api-kit
v1.0.0
Production-grade backend toolkit for building secure, scalable Retrieval-Augmented Generation (RAG) APIs.
- TypeScript — Written in TypeScript with full typings.
- Modular — Embedding, vector store, cache, and security are pluggable.
- DI-friendly — Inject your own providers via interfaces.
- Production-oriented — Error handling, metrics, no demo shortcuts.
Install
npm install rag-api-kit

Peer / runtime: openai and ioredis (only if using the Redis cache). Install them in your app:

npm install openai ioredis

Quick start
import { createRagKit } from "rag-api-kit";
const rag = createRagKit({
  embedding: { provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  vectorStore: { type: "memory" },
  retrieval: { topK: 5, minScore: 0.7 },
  chunk: { size: 100, overlap: 20 },
  security: { strict: true },
});
await rag.ingest([
  "Your document text here.",
  { text: "Another doc.", metadata: { source: "api" } },
]);
const result = await rag.ask("Your question?");
console.log(result.answer); // Assembled context
console.log(result.context); // Ranked chunks
console.log(result.metrics); // retrievalTimeMs, cacheHits, etc.

Configuration
| Key | Required | Description |
|-----|----------|-------------|
| embedding | Yes* | { provider: "openai", apiKey, model? } or supply embeddingProvider |
| vectorStore | Yes | { type: "memory" } or inject a VectorStore |
| retrieval | Yes | { topK, minScore?, maxContextTokens? } |
| chunk | Yes | { size, overlap } (token-based chunking) |
| cache | No | { redisUrl, ttlSeconds?, similarityThreshold? } for Redis semantic cache |
| security | No | { strict: true } enables prompt-injection filter |
* Either embedding (with provider: "openai") or embeddingProvider must be provided.
Features
Auto chunking
- Token-based chunking with configurable size and overlap.
- Metadata is preserved on each chunk.
import { createChunker } from "rag-api-kit";
const chunker = createChunker({ size: 100, overlap: 20 });
const chunks = chunker.chunk(["Long document...", { text: "Doc with meta.", metadata: { id: 1 } }]);

Embedding wrapper
- Interface: EmbeddingProvider (embed, embedBatch, dimensions).
- Built-in: OpenAIEmbeddingProvider (OpenAI embeddings).
- You can implement the interface for other providers.
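To target another backend, implement the interface yourself. A minimal sketch, assuming the interface shape described above (embed, embedBatch, dimensions); ToyEmbeddingProvider and its character-bucket scheme are purely illustrative, not part of the library:

```typescript
// Interface shape assumed from the docs above; details may differ.
interface EmbeddingProvider {
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
  dimensions(): number;
}

// Toy deterministic embedder (character-frequency buckets), useful for
// tests; swap in a real model client for production use.
class ToyEmbeddingProvider implements EmbeddingProvider {
  private readonly dim = 64;

  async embed(text: string): Promise<number[]> {
    const v = new Array<number>(this.dim).fill(0);
    for (const ch of text) v[ch.charCodeAt(0) % this.dim] += 1;
    const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0)) || 1;
    return v.map((x) => x / norm); // unit-normalized vector
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    return Promise.all(texts.map((t) => this.embed(t)));
  }

  dimensions(): number {
    return this.dim;
  }
}
```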
Vector store
- Interface: VectorStore (add, search, clear, size).
- Built-in: in-memory store with cosine similarity, topK, minScore, and metadata filters.
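A custom store only needs the four methods above. Here is a minimal in-memory sketch with cosine similarity; the StoredDoc, SearchOptions, and SearchHit shapes are assumptions for illustration, not the library's exact types:

```typescript
interface StoredDoc { vector: number[]; text: string; metadata?: Record<string, unknown>; }
interface SearchOptions { topK: number; minScore?: number; }
interface SearchHit { text: string; score: number; metadata?: Record<string, unknown>; }

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

class MemoryVectorStore {
  private docs: StoredDoc[] = [];

  add(docs: StoredDoc[]): void { this.docs.push(...docs); }

  // Score every doc, drop those below minScore, return the topK best.
  search(vector: number[], options: SearchOptions): SearchHit[] {
    return this.docs
      .map((d) => ({ text: d.text, metadata: d.metadata, score: cosine(vector, d.vector) }))
      .filter((h) => h.score >= (options.minScore ?? -Infinity))
      .sort((a, b) => b.score - a.score)
      .slice(0, options.topK);
  }

  clear(): void { this.docs = []; }
  size(): number { return this.docs.length; }
}
```

A brute-force scan like this is fine for small corpora; inject an adapter over a real vector database once the collection grows.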
Redis semantic cache (optional)
- Cache by embedding similarity: similar queries return cached response.
- Configurable TTL and similarity threshold.
- Enable by passing cache: { redisUrl, ttlSeconds?, similarityThreshold? }.
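A configuration sketch wiring the cache into createRagKit; the URL and threshold values below are illustrative, not defaults:

```typescript
import { createRagKit } from "rag-api-kit";

const rag = createRagKit({
  embedding: { provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  vectorStore: { type: "memory" },
  retrieval: { topK: 5 },
  chunk: { size: 100, overlap: 20 },
  cache: {
    redisUrl: "redis://localhost:6379",
    ttlSeconds: 3600,          // cached answers expire after an hour
    similarityThreshold: 0.95, // how close a query must be to reuse a cached answer
  },
});
```

A higher similarityThreshold trades fewer cache hits for less risk of serving a cached answer to a genuinely different question.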
Prompt injection filter
- Detects patterns such as “ignore previous instructions”, “reveal system prompt”, “act as system”, and prompt override attempts.
- Configurable strict mode; returns a structured SecurityError with reason and pattern.
import { createPromptInjectionFilter } from "rag-api-kit";
const filter = createPromptInjectionFilter({ strict: true });
const result = filter.check(userInput);
if (!result.safe) throw new SecurityError(result.reason!, result.pattern);

Retriever pipeline
- Embeds query → searches vector store → assembles context → enforces maxContextTokens when set.
- Returns ranked chunks (scores from cosine similarity).
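The last step, enforcing maxContextTokens, amounts to greedy selection over the ranked chunks. A sketch under the simplifying assumption of whitespace tokenization (the library's real token counting will differ):

```typescript
interface RankedChunk { text: string; score: number; }

// Greedy context assembly: take chunks in rank order until the token
// budget is exhausted. Tokens are approximated by whitespace words here.
function assembleContext(chunks: RankedChunk[], maxContextTokens: number): string {
  const picked: string[] = [];
  let used = 0;
  for (const chunk of [...chunks].sort((a, b) => b.score - a.score)) {
    const tokens = chunk.text.split(/\s+/).length;
    if (used + tokens > maxContextTokens) break; // budget exhausted
    picked.push(chunk.text);
    used += tokens;
  }
  return picked.join("\n\n");
}
```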
Metrics
- result.metrics: retrievalTimeMs, embeddingTimeMs, totalTimeMs, cacheHits, cacheMisses.
- rag.stats(): documentCount, chunkCount, cacheHits, cacheMisses.
API summary
- new RagKit(config) / createRagKit(config) — Build the RAG pipeline.
- ingest(documents) — Chunk, embed, and store. documents: string[] or { text, metadata? }[].
- ask(query, options?) — Security check → optional cache → retrieve → return { answer, context, fromCache, metrics }. options: { skipCache?, skipSecurityCheck? }.
- stats() — { documentCount, chunkCount, cacheHits, cacheMisses }.
- clearCache() — Clears the semantic cache and resets metrics.
Dependency injection
You can inject your own implementations:
createRagKit({
  embeddingProvider: myEmbeddingProvider, // replaces embedding: { provider: "openai", ... }
  vectorStore: myVectorStore,
  retrieval: { topK: 5 },
  chunk: { size: 50, overlap: 10 },
  security: { strict: true },
});

Implement:
- EmbeddingProvider — embed(text), embedBatch(texts), dimensions().
- VectorStore — add(docs), search(vector, options), clear(), size().
- SecurityFilter — check(input) → { safe, reason?, pattern? }.
- CacheLayer / SemanticCache — for key-based or similarity-based caching.
Errors
- RagError — Base error class (code: RagError).
- SecurityError — Prompt injection or security check failed (SECURITY_VIOLATION).
- EmbeddingError — Embedding call failed.
- VectorStoreError — Vector store operation failed.
- CacheError — Cache operation failed.
- ConfigurationError — Invalid or missing config.
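Because everything extends RagError, one instanceof check distinguishes library failures from unexpected errors. The stand-in classes below mirror the hierarchy for illustration only; in an application you would import them from rag-api-kit, and their constructors may differ:

```typescript
// Stand-ins mirroring the documented hierarchy (illustrative shapes).
class RagError extends Error {
  constructor(message: string, readonly code: string = "RagError") {
    super(message);
  }
}

class SecurityError extends RagError {
  constructor(message: string, readonly pattern?: string) {
    super(message, "SECURITY_VIOLATION");
  }
}

// Check the most specific class first, then fall back to the base.
function describeFailure(err: unknown): string {
  if (err instanceof SecurityError) return `blocked (${err.code}): ${err.message}`;
  if (err instanceof RagError) return `RAG failure (${err.code}): ${err.message}`;
  return "unexpected error";
}
```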
Project structure
src/
core/ # Chunker, Retriever, errors
providers/ # OpenAI embedding
adapters/ # In-memory vector store
security/ # Prompt injection filter
cache/ # In-memory + Redis semantic cache
types/ # Interfaces and config types
utils/ # Cosine similarity, metrics, token utils
rag-kit.ts # Main RagKit class
index.ts # Public API

License
MIT
