# alphaloop

v0.1.0
Turn any embeddings dataset into an agentic retrieval loop. Drop-in NPM package that adds query expansion, LLM re-ranking, and iterative refinement to your existing vector search.
Works with any AI SDK model provider (OpenAI, Anthropic, Google, Cloudflare Workers AI) and any vector database.
## Install
```sh
npm install alphaloop ai zod
```

## Quick Start
### 1. Create the handler (backend)
```ts
import { createAlphaloopHandler } from "alphaloop/handler";
import { openai } from "@ai-sdk/openai";

const handler = createAlphaloopHandler({
  model: openai("gpt-4o"),
  search: async (query, { topK }) => {
    // Your vector search — embed the query and search your database
    const results = await myVectorDB.search(query, topK);
    return results.map((r) => ({
      id: r.id,
      text: r.text,
      score: r.score,
      metadata: r.metadata,
    }));
  },
});

// Cloudflare Worker
export default { fetch: handler };

// Or Next.js API route
export const POST = handler;

// Or Express (the handler speaks Web Request/Response; this minimal
// bridge buffers the response instead of streaming it)
app.post("/api/chat", async (req, res) => {
  const request = new Request(`http://${req.headers.host}${req.originalUrl}`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(req.body), // requires express.json() middleware
  });
  const response = await handler(request);
  res.status(response.status).send(await response.text());
});
```

### 2. Connect the frontend
```tsx
import { useChat } from "@ai-sdk/react";
import { DefaultChatTransport } from "ai";
import { SearchProgress, Citations } from "alphaloop/react";

const transport = new DefaultChatTransport({ api: "/api/chat" });

function Search() {
  const { messages, sendMessage, status } = useChat({ transport });
  // Render messages, progress, and citations
}
```

That's it. The handler runs the full agentic loop and streams results.
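To pass the streamed progress events to `<SearchProgress>`, you can collect them from the message parts (the `"data-search-progress"` part type is described under Streaming Progress below). A minimal sketch; the `Message`/`Part` shapes here are simplified stand-ins for the AI SDK's `UIMessage` type, and `collectProgressEvents` is an illustrative helper, not part of alphaloop's API:

```typescript
// Pull streamed search-progress events out of useChat messages so they
// can be handed to <SearchProgress events={...} />.
type Part = { type: string; data?: unknown };
type Message = { role: string; parts: Part[] };

function collectProgressEvents(messages: Message[]): unknown[] {
  return messages
    .flatMap((m) => m.parts)
    .filter((p) => p.type === "data-search-progress")
    .map((p) => p.data);
}
```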
## How It Works
Alphaloop runs a four-step retrieval loop over your embeddings, plus an optional classification step:
### 1. Embedding Search (recall baseline)
Calls your search function with the original query. Retrieves the top N chunks as a starting point.
### 2. Query Expansion
The LLM generates diverse query variants — synonyms, rephrasings, related concepts, more specific/abstract formulations. All variants are searched in parallel and the results are deduplicated. This step alone substantially improves recall.
For example, a query about "grief" might expand to:
- "coping with loss and bereavement"
- "emotional response to death of a loved one"
- "processing absence and longing"
- "psychological stages of mourning"
### 3. LLM Re-ranking
All collected chunks are sent to the LLM for relevance scoring (0-1). The LLM reads each passage and evaluates how relevant it is to the original query. This catches implicit and subtle matches that embedding similarity misses — a passage about Buddhist impermanence might score highly for a grief query even though the word "grief" never appears.
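The merge-filter-sort half of this step is mechanical and can be sketched as below. The relevance scores themselves would come from an LLM call (for example, the AI SDK's `generateObject` with a Zod schema); `applyRelevanceScores` is an illustrative helper, not alphaloop's API:

```typescript
type RankedChunk = { id: string; text: string; score: number };

// Merge LLM relevance scores (0-1) back onto chunks, drop anything under
// the relevance threshold, and return the survivors best-first.
function applyRelevanceScores(
  chunks: RankedChunk[],
  scores: { id: string; relevance: number }[],
  threshold: number,
): RankedChunk[] {
  const byId = new Map(scores.map((s) => [s.id, s.relevance]));
  return chunks
    .map((c) => ({ ...c, score: byId.get(c.id) ?? 0 })) // unscored chunks fall below threshold
    .filter((c) => c.score >= threshold)
    .sort((a, b) => b.score - a.score);
}
```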
### 4. Iterative Refinement
Top-ranked passages are fed back to the LLM to generate *new* search queries based on discovered concepts. This is concept expansion — "I found passages about attachment and impermanence, what else should I search for?" The loop repeats up to `maxIterations` times or until no new relevant chunks are found.
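The control flow of this step looks roughly like the sketch below, with the LLM call abstracted as `proposeQueries`. All names here are illustrative; alphaloop's internals may differ:

```typescript
type Chunk = { id: string; text: string; score: number };

// Feed top-ranked chunks back to the LLM for new queries, search them,
// and stop when an iteration discovers nothing new (or maxIterations hits).
async function refineLoop(
  search: (q: string) => Promise<Chunk[]>,
  proposeQueries: (top: Chunk[]) => Promise<string[]>,
  seed: Chunk[],
  maxIterations: number,
): Promise<Chunk[]> {
  const seen = new Map(seed.map((c) => [c.id, c]));
  for (let i = 0; i < maxIterations; i++) {
    const top = [...seen.values()].sort((a, b) => b.score - a.score).slice(0, 10);
    const queries = await proposeQueries(top);
    const results = (await Promise.all(queries.map(search))).flat();
    const fresh = results.filter((c) => !seen.has(c.id));
    if (fresh.length === 0) break; // converged: nothing new found
    for (const c of fresh) seen.set(c.id, c);
  }
  return [...seen.values()];
}
```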
### 5. Classification (optional)
An optional step that classifies remaining unranked chunks against abstract concepts, catching things even the re-ranker might miss.
## Configuration
```ts
createAlphaloopHandler({
  // Required
  model: openai("gpt-4o"),            // Any AI SDK LanguageModel
  search: mySearchFunction,           // Your embedding search

  // Optional
  rerankModel: openai("gpt-4o-mini"), // Cheaper model for re-ranking
  initialTopK: 200,                   // Chunks per search call (default: 200)
  maxExpandedQueries: 8,              // Query variants per round (default: 8)
  maxIterations: 3,                   // Refinement rounds (default: 3)
  relevanceThreshold: 0.3,            // Min relevance score 0-1 (default: 0.3)
  enableClassifier: false,            // Enable classifier step (default: false)
  systemPrompt: "...",                // Custom system prompt
  additionalTools: { ... },           // Extra AI SDK tools
  maxToolSteps: 5,                    // Max tool call steps (default: 5)
});
```

## Search Function
Your search function takes a string query and returns chunks. You handle embedding internally — alphaloop only generates query strings.
```ts
type EmbeddingSearchFn = (
  query: string,
  options: { topK: number },
) => Promise<EmbeddingChunk[]>;

interface EmbeddingChunk {
  id: string;
  text: string;
  score: number;
  metadata?: Record<string, unknown>;
}
```

Works with any vector database: Cloudflare Vectorize, Pinecone, Weaviate, pgvector, in-memory, etc.
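For example, a minimal in-memory implementation over precomputed vectors might look like this. Here `embed` is a hypothetical embedding function you supply, and cosine similarity stands in for your database's scoring:

```typescript
interface Doc { id: string; text: string; vector: number[]; metadata?: Record<string, unknown> }

function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// An EmbeddingSearchFn backed by an in-memory array of embedded docs:
// embed the query, score every doc, return the topK best matches.
function makeInMemorySearch(docs: Doc[], embed: (q: string) => Promise<number[]>) {
  return async (query: string, { topK }: { topK: number }) => {
    const queryVec = await embed(query);
    return docs
      .map((d) => ({ id: d.id, text: d.text, score: cosine(queryVec, d.vector), metadata: d.metadata }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK);
  };
}
```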
## React Components

Optional UI components for displaying search progress and citations. Import from `alphaloop/react`.
### SearchProgress
Shows streaming progress as the loop runs:
```tsx
import { SearchProgress } from "alphaloop/react";

<SearchProgress events={progressEvents} isRunning={true} />
```

### Citations
Expandable source citations with quote-style previews:
```tsx
import { Citations } from "alphaloop/react";

<Citations
  chunks={citations}
  getSourceUrl={(chunk) => `/docs/${chunk.id}`} // Optional: make source IDs clickable links
/>
```

## Programmatic Usage
Use the core API directly without the handler:
```ts
import { createAlphaloop } from "alphaloop";
import { openai } from "@ai-sdk/openai";

const loop = createAlphaloop({
  model: openai("gpt-4o"),
  search: mySearchFn,
});

// Run the full loop
const result = await loop.run("What is consciousness?");
console.log(result.chunks);     // Ranked results
console.log(result.iterations); // Loop telemetry

// Or use as AI SDK tools
const tools = loop.tools();
// tools.deep_search — use with streamText()
```

## Streaming Progress
The handler streams progress events during tool execution using the AI SDK's `createUIMessageStream`. Register the `dataPartSchemas` in your `useChat` call to receive them:
```ts
import { useChat } from "@ai-sdk/react";
import { jsonSchema } from "ai";

const { messages } = useChat({
  transport,
  dataPartSchemas: {
    "search-progress": {
      schema: jsonSchema({ type: "object", properties: { type: { type: "string" } } }),
    },
  },
});

// Progress events appear as message parts with type "data-search-progress"
```

## Cloudflare Workers
Alphaloop uses Web Standards only (`ReadableStream`, `fetch`, `TextEncoder`) — no Node.js APIs. It works on Cloudflare Workers out of the box.
No special `wrangler.toml` flags are needed. The loop is mostly network calls (LLM API, vector DB), which don't count toward the CPU time limit.

## License
MIT
