@halo-sdk/rag

v2.0.0

Published

3 days ago

Cache-aware retrieval-augmented generation for Halo AI SDK — sticky retrieval + append-only growth that keeps the prefix cache warm

Downloads

214

Vulnerabilities: 0 High, 0 Medium, 0 Low

maplecity0512

ai embeddings llm prefix-cache rag retrieval

@halo-sdk/rag

Cache-aware retrieval-augmented generation for Halo AI SDK.

Most RAG implementations re-retrieve on every turn and splice fresh documents into the prompt, which silently invalidates the provider's prefix cache on every message. @halo-sdk/rag instead produces a reusable cache segment governed by two cache-preserving policies:

Sticky retrieval: near-duplicate consecutive queries skip re-retrieval entirely, so the segment (and everything cached after it) is untouched.
Append-only growth: when a new result set extends the previous one, the new docs are appended instead of rebuilding the segment, so the already-cached prefix stays valid.
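
To make the two policies concrete, here is a minimal self-contained sketch of how such a segment could behave. It is not the @halo-sdk/rag implementation: SegmentSketch, the token-overlap (Jaccard) near-duplicate test, and the 0.8 threshold are all assumptions chosen for illustration, and the real library may decide "near-duplicate" and "extends" quite differently.

```typescript
// Hypothetical sketch of sticky retrieval + append-only growth.
// None of these names come from @halo-sdk/rag.
type SketchDoc = { id: string; text: string };

// Token-set Jaccard similarity: a stand-in for whatever near-duplicate
// test a real implementation would use (e.g. embedding similarity).
function jaccard(a: string, b: string): number {
  const ta = Array.from(new Set(a.toLowerCase().split(/\W+/).filter(Boolean)));
  const tb = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  const shared = ta.filter((t) => tb.has(t)).length;
  const union = ta.length + tb.size - shared;
  return union === 0 ? 1 : shared / union;
}

class SegmentSketch {
  private lastQuery: string | null = null;
  private docs: SketchDoc[] = [];
  private retrieve: (query: string) => SketchDoc[];
  private stickyThreshold: number;

  constructor(retrieve: (query: string) => SketchDoc[], stickyThreshold: number = 0.8) {
    this.retrieve = retrieve;
    this.stickyThreshold = stickyThreshold; // assumed value, not a library default
  }

  update(query: string): "sticky" | "appended" | "rebuilt" {
    // Sticky retrieval: a near-duplicate query leaves the segment untouched.
    if (this.lastQuery !== null && jaccard(query, this.lastQuery) >= this.stickyThreshold) {
      return "sticky";
    }
    this.lastQuery = query;
    const results = this.retrieve(query);
    const resultIds = new Set(results.map((d) => d.id));

    // Append-only growth: if every doc already in the segment is still in
    // the new result set, keep the existing order and add only new docs.
    if (this.docs.every((d) => resultIds.has(d.id))) {
      const have = new Set(this.docs.map((d) => d.id));
      for (const d of results) {
        if (!have.has(d.id)) this.docs.push(d);
      }
      return "appended";
    }

    // Otherwise the prefix cannot be preserved and the segment is rebuilt.
    this.docs = results.slice();
    return "rebuilt";
  }

  get text(): string {
    return this.docs.map((d) => d.text).join("\n\n");
  }
}
```

In this sketch only the "rebuilt" outcome changes text that was already in the segment; "sticky" changes nothing and "appended" only adds text at the end, which is what lets a prefix cache keep matching the earlier portion.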

Usage

import { Halo } from "@halo-sdk/core";
import { CacheAwareRag, VectorRetriever, MemoryVectorStore } from "@halo-sdk/rag";

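// Index your documents once, up front. myEmbedder is your own Embedder implementation.
const store = new MemoryVectorStore();
// store.add(docs, await myEmbedder.embed(docs.map(d => d.text)))
const retriever = new VectorRetriever(myEmbedder, store);

const rag = new CacheAwareRag({ retriever, k: 4 });
await rag.update("what is prompt caching?");

// halo is an existing Halo instance; its construction is not shown here.
const agent = halo.agent({
  /* ... */
});
// Attach the RAG output as a context segment of the agent's prompt.
agent.setContextSegments([rag.segment]);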

Seams

Embedder, VectorStore, and Retriever are interfaces, so you can bring your own embedding model and vector DB. MemoryVectorStore (cosine similarity) and VectorRetriever are included for tests, demos, and small corpora. Halo owns the cache-aware orchestration, not the embedding or storage.
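
As a rough picture of what plugging into these seams could look like, the sketch below defines stand-in interfaces and a tiny cosine-similarity store. The real Embedder and VectorStore signatures are not shown in this README, so every name and method shape here (EmbedderSketch, VectorStoreSketch, search, the letter-count embedder) is an assumption for illustration; check the package's type definitions before implementing against them.

```typescript
// Hypothetical interface shapes; the real @halo-sdk/rag types may differ.
type SeamDoc = { id: string; text: string };

interface EmbedderSketch {
  embed(texts: string[]): Promise<number[][]>;
}

interface VectorStoreSketch {
  add(docs: SeamDoc[], vectors: number[][]): void;
  search(vector: number[], k: number): SeamDoc[];
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// In-memory store that ranks documents by cosine similarity to the query vector.
class CosineStoreSketch implements VectorStoreSketch {
  private rows: { doc: SeamDoc; vector: number[] }[] = [];

  add(docs: SeamDoc[], vectors: number[][]): void {
    docs.forEach((doc, i) => this.rows.push({ doc, vector: vectors[i] }));
  }

  search(vector: number[], k: number): SeamDoc[] {
    return this.rows
      .map((r) => ({ doc: r.doc, score: cosine(vector, r.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k)
      .map((r) => r.doc);
  }
}

// Toy embedder: a 26-dimensional count of the letters a-z.
// A real embedder would call an embedding model instead.
class LetterCountEmbedder implements EmbedderSketch {
  async embed(texts: string[]): Promise<number[][]> {
    return texts.map((t) => {
      const v: number[] = [];
      for (let i = 0; i < 26; i++) v.push(0);
      const lower = t.toLowerCase();
      for (let i = 0; i < lower.length; i++) {
        const idx = lower.charCodeAt(i) - 97; // 'a' is char code 97
        if (idx >= 0 && idx < 26) v[idx] += 1;
      }
      return v;
    });
  }
}
```

The point of the split is that the store never sees text and the embedder never sees storage, so either side can be swapped (for example, a hosted embedding API on one side and an external vector DB on the other) without touching the other.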
