kolm-search

v1.0.0

Published

3 days ago

Headless RAG search orchestration for your existing stack. Build production-grade search pipelines using your own databases and LLM providers. Zero vendor lock-in, edge-ready, and fully type-safe.

Vulnerabilities: 0 High, 0 Medium, 0 Low

jellek87

rag search hybrid-search vector-search fulltext-search retrieval-augmented-generation cloudflare-workers postgresql typescript rrf pipeline

Overview

kolm-search is a TypeScript library for building professional search and Retrieval-Augmented Generation (RAG) pipelines.

It acts as a headless search engine: it provides the orchestration logic (query expansion, hybrid retrieval over vector and fulltext indexes, reranking, and LLM synthesis) while your data stays where it lives. It targets developers who need complete control over their retrieval logic and want a transparent, minimal-dependency alternative to larger, general-purpose AI frameworks.

✅ Full RAG Pipeline: Hybrid search, Reciprocal Rank Fusion (RRF), and query normalization out of the box.
✅ Zero Vendor Lock-in: Bring your own database (Postgres, D1, etc.) and your own LLM provider (OpenAI, Anthropic, Workers AI).
✅ Edge-Ready Performance: Designed for high-performance Node.js, Bun, and Edge environments (Cloudflare Workers).
✅ Strictly Type-Safe: Built with TypeScript and Standard Schema V1 support for validated inputs/outputs.

Documentation

Installation

# pnpm
pnpm add kolm-search

# npm
npm install kolm-search

# yarn
yarn add kolm-search

# bun
bun add kolm-search

Quick Start (Presets)

Presets are pre-configured search clients for specific environments.

1. In-Memory (Dev/Testing)

import { createBasicSearchClient } from "kolm-search/presets/basic";

const client = createBasicSearchClient([
  { id: "1", title: "Setup", content: "Install via pnpm add kolm-search" },
]);

const { results } = await client.search({ query: "how to install" });

2. Cloudflare (Workers + Vectorize + D1)

import { createCloudflareSearchClient } from "kolm-search/presets/cloudflare";

export default {
  async fetch(request, env) {
    const client = createCloudflareSearchClient(env, { d1Table: "docs_fts" });
    const response = await client.search({ query: "hybrid search", mode: "hybrid" });
    return Response.json(response);
  },
};

Example: Custom RAG Implementation

kolm-search is adapter-driven. You can mix and match providers for different stages of the lifecycle by implementing simple interfaces.

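import OpenAI from "openai";
import { SearchClient, DefaultQueryPlanner } from "kolm-search";
import { createFulltextRetriever } from "kolm-search/adapters/generic";

// The OpenAI SDK reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();
// `db` below stands for your own database client; this example assumes its
// query() call resolves to an array of rows.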

const client = new SearchClient({
  // Required: normalize and plan queries
  planner: new DefaultQueryPlanner(),

  // 1. Convert text to vector
  embedder: {
    async embed(text) {
      const res = await openai.embeddings.create({ model: "text-embedding-3-small", input: text });
      return res.data[0].embedding;
    }
  },

  // 2. Fetch documents from your own database
  retriever: createFulltextRetriever({
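    async search(query, limit) {
      // plainto_tsquery accepts free-form user text; to_tsquery rejects multi-word input that lacks operators
      return db.query("SELECT id, content FROM docs WHERE text @@ plainto_tsquery($1) LIMIT $2", [query, limit]);
    },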
    toDocument: (row) => ({ id: row.id, content: row.content })
  }),

  // 3. (Optional) Synthesize an LLM answer
  synthesizer: {
    async synthesize(context) {
      const text = context.results.map(r => r.content).join("\n");
      const res = await openai.chat.completions.create({
        model: "gpt-4o",
        messages: [{ role: "system", content: `Context: ${text}` }, { role: "user", content: context.plan.normalizedQuery }]
      });
      // message.content can be null (for example on a refusal), so fall back to an empty string
      return res.choices[0].message.content ?? "";
    }
  }
});

Features

Hybrid Search: Combine Fulltext (keyword) and Vector (semantic) search with Reciprocal Rank Fusion for superior relevance.
Smart Query Planning: Multi-query expansion and intent classification to understand what users actually want.
Production Orchestration: Built-in parallel retrieval, response caching with TTLs, and deduplication.
Framework Agnostic: Runs anywhere (Node, Bun, Edge, Deno). No heavy dependencies.
Observability: Detailed stage-level telemetry (track duration and success per-stage).
Schema Validation: Native support for Zod, Valibot, and ArkType via Standard Schema.
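
To make the Hybrid Search bullet concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain TypeScript. It is not the library's internal implementation: the function name `reciprocalRankFusion`, the `Ranked` type, and the constant `k = 60` (a commonly used default) are assumptions chosen for illustration.

```typescript
// Reciprocal Rank Fusion: score(d) = sum over result lists of 1 / (k + rank),
// where rank is the 1-based position of document d in that list.
// Hypothetical helper for illustration; not part of the kolm-search API.
type Ranked = { id: string };

function reciprocalRankFusion(
  lists: Ranked[][],
  k = 60,
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((doc, index) => {
      const rank = index + 1;
      scores.set(doc.id, (scores.get(doc.id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Two rankings of overlapping documents, e.g. one fulltext and one vector.
const fulltextRanking = [{ id: "a" }, { id: "b" }, { id: "c" }];
const vectorRanking = [{ id: "b" }, { id: "d" }, { id: "a" }];

const fused = reciprocalRankFusion([fulltextRanking, vectorRanking]);
console.log(fused.map((r) => r.id)); // [ 'b', 'a', 'd', 'c' ]
```

Documents that rank well in both lists ("a" and "b") rise above documents that appear in only one, which is why RRF works without comparing raw scores across retrievers.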

Architecture: The Headless Engine

kolm-search follows a modular port-and-adapter architecture. You provide the adapters (how to talk to your DB/LLM), and the library handles the pipeline (the logic of how a search should run).

Request ──▶ [ Cache (hit?) ] ──▶ [ QueryPlanner ] ──▶ [ IntentClassifier ]
                                                              │
    ┌─────────────────────────────────────────────────────────┘
    ▼
[ Embedder ] ──▶ [ Retriever (Vector/Fulltext/Hybrid) ] ──▶ [ Deduplicator ]
                                                              │
    ┌─────────────────────────────────────────────────────────┘
    ▼
[ Reranker ] ──▶ [ Pagination ] ──▶ [ Synthesizer (LLM) ] ──▶ [ Cache (set) ] ──▶ SearchResponse
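
The diagram can be read as a chain of ports, each filled by an adapter. The sketch below illustrates that idea for a subset of the stages (cache lookup, parallel retrieval, deduplication, pagination, cache write). The `Retriever` and `Cache` interfaces, `MemoryCache`, `runPipeline`, and the 60-second TTL are hypothetical stand-ins, not the library's real types.

```typescript
// Hypothetical ports; the real kolm-search interfaces may differ.
type Doc = { id: string; content: string };

interface Retriever {
  retrieve(query: string, limit: number): Promise<Doc[]>;
}

interface Cache {
  get(key: string): Promise<Doc[] | undefined>;
  set(key: string, value: Doc[], ttlMs: number): Promise<void>;
}

// In-memory cache adapter that expires entries after their TTL.
class MemoryCache implements Cache {
  private store = new Map<string, { value: Doc[]; expiresAt: number }>();

  async get(key: string): Promise<Doc[] | undefined> {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= Date.now()) {
      this.store.delete(key);
      return undefined;
    }
    return entry.value;
  }

  async set(key: string, value: Doc[], ttlMs: number): Promise<void> {
    this.store.set(key, { value, expiresAt: Date.now() + ttlMs });
  }
}

async function runPipeline(
  query: string,
  retrievers: Retriever[],
  cache: Cache,
  limit = 10,
): Promise<Doc[]> {
  // Stand-in for the planner: normalize the query and use it as the cache key.
  const key = query.trim().toLowerCase();

  const cached = await cache.get(key);
  if (cached) return cached; // cache hit: skip retrieval entirely

  // Run all retrievers in parallel.
  const batches = await Promise.all(retrievers.map((r) => r.retrieve(key, limit)));

  // Deduplicate by document id, keeping the first occurrence.
  const seen = new Set<string>();
  const results: Doc[] = [];
  for (const batch of batches) {
    for (const doc of batch) {
      if (!seen.has(doc.id)) {
        seen.add(doc.id);
        results.push(doc);
      }
    }
  }

  const page = results.slice(0, limit); // pagination
  await cache.set(key, page, 60_000); // cache for 60 seconds
  return page;
}

// Two toy adapters returning overlapping documents.
const fulltextAdapter: Retriever = {
  async retrieve() {
    return [
      { id: "1", content: "Install" },
      { id: "2", content: "Configure" },
    ];
  },
};
const vectorAdapter: Retriever = {
  async retrieve() {
    return [
      { id: "2", content: "Configure" },
      { id: "3", content: "Deploy" },
    ];
  },
};

runPipeline("  Install  ", [fulltextAdapter, vectorAdapter], new MemoryCache()).then(
  (docs) => console.log(docs.map((d) => d.id)), // [ '1', '2', '3' ]
);
```

Because the pipeline only depends on the interfaces, swapping `MemoryCache` for a KV-backed adapter, or a toy retriever for a Postgres one, leaves `runPipeline` untouched.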

Advanced Configuration

Error Handling

Two error types are thrown by the pipeline:

SearchError — a named pipeline stage failed (query validation, retrieval, embedding, etc.)
SchemaValidationError — input or output failed Standard Schema validation

Identify exactly which stage failed using the PIPELINE_STAGES constants:

import { SearchError, SchemaValidationError, PIPELINE_STAGES } from "kolm-search";

try {
  await client.search({ query: "..." });
} catch (error) {
  if (error instanceof SearchError && error.stage === PIPELINE_STAGES.EMBEDDER) {
    console.error("The embedding stage failed:", error.message);
  }
  if (error instanceof SchemaValidationError) {
    console.error("Validation failed on", error.target, error.issues);
  }
}
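
One possible use of the two error types is to translate them into HTTP status codes at the edge of a service. The sketch below defines local stand-in classes so that it runs on its own; in real code you would import `SearchError` and `SchemaValidationError` from "kolm-search" instead. The stage names, the `"input"` target value, and the helper `toHttpStatus` are assumptions for illustration.

```typescript
// Local stand-ins so the sketch is self-contained. They mirror only the
// properties used in the example above (stage, target, issues).
class SearchError extends Error {
  stage: string;
  constructor(stage: string, message: string) {
    super(message);
    this.stage = stage;
  }
}

class SchemaValidationError extends Error {
  target: string;
  issues: unknown[];
  constructor(target: string, issues: unknown[]) {
    super(`Validation failed on ${target}`);
    this.target = target;
    this.issues = issues;
  }
}

// Stages that typically call external services. These names are assumptions;
// compare against the PIPELINE_STAGES constants in real code.
const UPSTREAM_STAGES = new Set(["embedder", "retriever", "synthesizer"]);

// Hypothetical helper: map a pipeline failure to an HTTP status code.
function toHttpStatus(error: unknown): number {
  if (error instanceof SchemaValidationError) {
    // Bad caller input is a client error; a bad output is a server error.
    return error.target === "input" ? 400 : 500;
  }
  if (error instanceof SearchError) {
    return UPSTREAM_STAGES.has(error.stage) ? 502 : 500;
  }
  return 500;
}

console.log(toHttpStatus(new SearchError("embedder", "timeout"))); // 502
console.log(toHttpStatus(new SchemaValidationError("input", []))); // 400
```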

Telemetry

Monitor search performance per stage, with durations reported in milliseconds.

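// Assumes the Telemetry type is exported from the package root.
import type { Telemetry } from "kolm-search";

const telemetry: Telemetry = {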
  async track(event, payload) {
    // event is always "search.completed"
    // payload shape: { durationMs: number, mode: SearchMode, resultCount: number, stageDurations: Record<string, number> }
    // stageDurations keys: "planner", "embedder", "retriever", "reranker", "synthesizer", "cache.get", "cache.set"
    console.log(`Search completed in ${payload.durationMs}ms`, payload.stageDurations);
  },
};
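
As one way to act on this payload, the sketch below finds the slowest stage in `stageDurations` and its share of the total duration. The `SearchCompletedPayload` type is copied from the payload shape described in the comment above, with `mode` simplified to `string`; the helper `slowestStage` and the sample numbers are illustrative assumptions.

```typescript
// Payload shape as described in the telemetry comment; `mode` is simplified
// to string so the sketch does not depend on the library's SearchMode type.
type SearchCompletedPayload = {
  durationMs: number;
  mode: string;
  resultCount: number;
  stageDurations: Record<string, number>;
};

// Hypothetical helper: return the stage with the largest duration and the
// fraction of total time it accounts for.
function slowestStage(
  payload: SearchCompletedPayload,
): { stage: string; durationMs: number; share: number } | undefined {
  const entries = Object.entries(payload.stageDurations);
  if (entries.length === 0) return undefined;

  const [stage, durationMs] = entries.reduce((max, entry) =>
    entry[1] > max[1] ? entry : max,
  );
  const share = payload.durationMs > 0 ? durationMs / payload.durationMs : 0;
  return { stage, durationMs, share };
}

// Made-up sample values for illustration only.
const samplePayload: SearchCompletedPayload = {
  durationMs: 80,
  mode: "hybrid",
  resultCount: 5,
  stageDurations: { planner: 2, embedder: 40, retriever: 25, reranker: 8 },
};

console.log(slowestStage(samplePayload));
// { stage: 'embedder', durationMs: 40, share: 0.5 }
```

Calling a helper like this inside `track` makes it easy to log or alert only when a single stage dominates the request time.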

Contributing

Contributions are welcome. Please open an issue before submitting a pull request for significant changes.
