# @llm-context/core

Core engine for Lang Context Attention — a topic-aware context routing system for LLM conversations.
## What It Does
In multi-turn LLM conversations, users jump between topics. This engine automatically:
- Clusters messages by topic using hybrid retrieval (vector + BM25 + RRF fusion)
- Routes each message to the correct topic via LLM judgment
- Assembles only relevant context with token budget management
- Streams responses back with full routing observability
The result: focused LLM responses with ~50% token savings.
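Reciprocal rank fusion (RRF) is the step that merges the vector and BM25 result lists into a single ranking. A minimal sketch of the standard formula, independent of this package's internals (the function name and shapes here are illustrative, not part of the API):

```ts
// Reciprocal Rank Fusion: each document's fused score is the sum of
// 1 / (k + rank) over every ranked list it appears in. The constant k
// (conventionally 60) damps top ranks so no single retriever dominates.
function rrfFuse(rankedLists: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>()
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1 // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank))
    })
  }
  return scores
}

// A message ranked by both retrievers outscores one found by only one:
const vectorHits = ['msg-a', 'msg-b', 'msg-c'] // semantic ranking
const keywordHits = ['msg-b', 'msg-d']         // BM25 ranking
const fused = [...rrfFuse([vectorHits, keywordHits])]
  .sort((a, b) => b[1] - a[1])
// 'msg-b' ranks first: 1/62 + 1/61 beats 'msg-a' at 1/61
```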
## Install

```shell
pnpm add @llm-context/core
```

## Quick Start
```ts
import { createEngine } from '@llm-context/core'

const engine = createEngine({
  store: yourStoreProvider,         // Where to persist data
  vectorSearch: yourVectorSearch,   // Semantic similarity search
  keywordSearch: yourKeywordSearch, // BM25 keyword search
  chat: yourChatProvider,           // LLM for responses
  judge: yourJudgeProvider,         // LLM for topic classification
  embedding: yourEmbeddingProvider, // Text → vector embedding
})

// Create a session
const session = await engine.createSession('You are a helpful assistant.')

// Send messages — routing happens automatically
const { stream, routingDecision, rootQuestionId } =
  await engine.processMessage(session.id, 'How do I deploy to AWS?')

for await (const chunk of stream) {
  process.stdout.write(chunk)
}

// The next message is automatically routed to the right topic
const r2 = await engine.processMessage(session.id, 'What about using Docker on AWS?')
// → routed to the same topic as above

const r3 = await engine.processMessage(session.id, 'Best chocolate cake recipe?')
// → creates a new topic (unrelated to AWS)
```

## Default Implementations
Use these companion packages for zero-config setup:
| Package | Description |
|---------|-------------|
| @llm-context/store-sqlite | SQLite storage + sqlite-vec vector search + FTS5 keyword search |
| @llm-context/provider-ai-sdk | Vercel AI SDK providers (OpenAI, Anthropic, etc.) |
```ts
import { createEngine } from '@llm-context/core'
import {
  createDatabase,
  SqliteStore,
  SqliteVectorSearch,
  SqliteKeywordSearch,
} from '@llm-context/store-sqlite'
import {
  AiSdkChatProvider,
  AiSdkJudgeProvider,
  AiSdkEmbeddingProvider,
} from '@llm-context/provider-ai-sdk'
import { openai } from '@ai-sdk/openai'

const db = createDatabase('./conversations.db')

const engine = createEngine({
  store: new SqliteStore(db),
  vectorSearch: new SqliteVectorSearch(db, 1536),
  keywordSearch: new SqliteKeywordSearch(db),
  chat: new AiSdkChatProvider(openai('gpt-4o-mini')),
  judge: new AiSdkJudgeProvider({ model: openai('gpt-4o-mini') }),
  embedding: new AiSdkEmbeddingProvider({
    model: openai.embedding('text-embedding-3-small'),
    dimensions: 1536,
  }),
})
```

## Engine API
### Session Management
```ts
engine.createSession(systemPrompt: string, title?: string): Promise<Session>
engine.getSession(sessionId: string): Promise<Session | null>
```

### Message Processing
```ts
// Core method — handles the full routing pipeline
engine.processMessage(sessionId: string, userMessage: string): Promise<{
  stream: AsyncIterable<string>    // Streaming LLM response
  routingDecision: RoutingDecision // Full routing metadata
  rootQuestionId: string           // Which topic this was routed to
}>
```

### Query Methods
```ts
engine.getRootQuestions(sessionId): Promise<RootQuestion[]>  // All topics
engine.getMessages(rootQuestionId): Promise<Message[]>       // Messages in a topic
engine.getTimeline(sessionId): Promise<Message[]>            // All messages chronologically
engine.getRoutingDecision(messageId): Promise<RoutingDecision | null>
```

### Manual Operations
```ts
engine.reassignMessage(messageId, newTopicId): Promise<void>  // Fix routing errors
engine.linkQuestions(topicA, topicB): Promise<QuestionLink>   // Link related topics
engine.unlinkQuestions(linkId): Promise<void>
```

## Configuration
```ts
createEngine({
  // ... providers (required) ...
  topK: 5,                     // Candidates per retrieval (default: 5)
  rrfK: 60,                    // RRF fusion constant (default: 60)
  minFusedScoreForJudge: 0.01, // Score threshold for judge (default: 0.01)
  maxContextTokens: 4000,      // Token budget for context (default: 4000)
  summaryUpdateInterval: 5,    // Re-summarize every N messages (default: 5)
  summaryContextSize: 10,      // Messages for summary prompt (default: 10)

  // Callbacks
  onRoutingComplete: (decision) => { /* routing telemetry */ },
  onLinkSuggestion: (suggestion) => { /* UI notification */ },
})
```

## Provider Interfaces
Implement these to use your own storage, search, or LLM:
```ts
interface StoreProvider { /* Session, RootQuestion, Message, RoutingDecision, QuestionLink CRUD */ }
interface VectorSearchProvider { upsert, search, delete }
interface KeywordSearchProvider { upsert, search, delete }
interface ChatProvider { chat, streamChat }
interface JudgeProvider { judge }
interface EmbeddingProvider { embed, dimensions }
```

Full interface definitions: `interfaces.ts`
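As an illustration only, here is what a toy in-memory `KeywordSearchProvider` might look like. The method signatures below are assumptions made for this sketch; check `interfaces.ts` for the real shapes:

```ts
// Toy keyword search: scores each stored document by how many query
// terms it contains. Real deployments would use BM25 (e.g. SQLite FTS5).
// The upsert/search/delete signatures are assumed, not the package's API.
class InMemoryKeywordSearch {
  private docs = new Map<string, string>()

  async upsert(id: string, text: string): Promise<void> {
    this.docs.set(id, text.toLowerCase())
  }

  async search(query: string, topK: number): Promise<{ id: string; score: number }[]> {
    const terms = query.toLowerCase().split(/\s+/).filter(Boolean)
    const hits: { id: string; score: number }[] = []
    for (const [id, text] of this.docs) {
      const score = terms.filter((t) => text.includes(t)).length
      if (score > 0) hits.push({ id, score })
    }
    // Highest term-overlap first, truncated to topK
    return hits.sort((a, b) => b.score - a.score).slice(0, topK)
  }

  async delete(id: string): Promise<void> {
    this.docs.delete(id)
  }
}
```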
## Routing Flow

```
User Message → Embed → [Vector Search ∥ Keyword Search] → RRF Fusion → LLM Judge → Context Assembly → Stream Response
```

See the design spec for full architecture details.
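The Context Assembly step is where `maxContextTokens` is enforced. A common strategy, sketched here with a rough `estimateTokens` heuristic rather than the engine's actual implementation, is to pack the most recent on-topic messages until the budget runs out:

```ts
// Rough token estimate: ~4 characters per token for English text.
// Both this heuristic and the packing strategy are illustrative assumptions.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4)

function assembleContext(messages: string[], maxContextTokens: number): string[] {
  const selected: string[] = []
  let used = 0
  // Walk newest-first so recent turns survive when the budget is tight
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i])
    if (used + cost > maxContextTokens) break
    selected.unshift(messages[i]) // restore chronological order
    used += cost
  }
  return selected
}
```

Dropping the oldest messages first is what makes the topic summary (refreshed every `summaryUpdateInterval` messages) valuable: it preserves a compressed trace of turns that no longer fit in the budget.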
## License
MIT
