memory-runtime
Stateless context runtime for LLM applications.
memory-runtime is a stateless SDK that replaces the sliding-window chat history approach with structured state and snapshot-based context management. It reduces prompt sizes by roughly 80% while preserving high retrieval quality even in large sessions, with zero database dependencies.
Install
npm i memory-runtime
Quick Start
import { createSession } from "memory-runtime";
// Create or load session
let session = createSession({ sessionId: "user-123" });
// Ingest code, docs, or messages
session.ingest({
type: "snippet",
payload: {
source: "auth.ts",
content: "// authentication code...",
pinned: true // Won't be dropped when buffer fills
}
});
session.ingest({
type: "user_message",
payload: { content: "How does authentication work?" }
});
// Compile budget-aware prompt
const { messages, snapshot } = session.compile({
userMessage: "Explain the auth flow",
budgetTokens: 2000
});
// Call your LLM provider
const response = await openai.chat.completions.create({
model: "gpt-4o",
messages
});
// Observe response for state extraction
const { snapshot: updatedSnapshot } = session.observe({
assistantText: response.choices[0].message.content
});
// Store snapshot (your choice: DB, Redis, localStorage, etc.)
await storage.save("user-123", updatedSnapshot);
Stateless Architecture
No SQLite. No filesystem. Pure snapshots.
Every compile() and observe() call returns an updated snapshot—a JSON-serializable object containing:
- State: Extracted constraints, decisions, open threads, glossary
- Artifacts: Code snippets, diffs, doc chunks (with rolling buffer)
- Events: Message history (bounded)
- Meta: Custom metadata (e.g., KB ingestion tracking)
Your application owns storage. The library never touches the filesystem.
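To make the ownership model concrete, here is a minimal sketch of one stateless turn. The storage object and llm helper are hypothetical stand-ins for your own store and provider call:
import { createSession } from "memory-runtime";

async function handleTurn(sessionId: string, userMessage: string) {
  // Rehydrate from the previous turn's snapshot (undefined on the first turn)
  const previous = await storage.load(sessionId); // hypothetical store
  const session = createSession({ sessionId, snapshot: previous });

  // Compile a budget-aware prompt and call your provider
  const { messages } = session.compile({ userMessage, budgetTokens: 2000 });
  const assistantText = await llm(messages); // hypothetical provider call

  // Extract state, then persist the updated snapshot for the next turn
  const { snapshot } = session.observe({ assistantText });
  await storage.save(sessionId, snapshot);
  return assistantText;
}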
Snapshot Management
Where to Store Snapshots
- Server-side: PostgreSQL, Redis, or any database (a Redis sketch follows this list)
- Client-side: localStorage (compact mode recommended)
- Serverless: Pass snapshots in request/response payloads
- Encrypted cookies: For small sessions with compact mode
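As one server-side option, a Redis-backed store can be as small as this. A sketch assuming the ioredis client; the key prefix and 24-hour TTL are illustrative choices:
import Redis from "ioredis";

const redis = new Redis(); // connection details omitted

const storage = {
  async save(sessionId: string, snapshot: object) {
    // Snapshots are plain JSON, so they serialize directly
    await redis.set(`mr:snapshot:${sessionId}`, JSON.stringify(snapshot), "EX", 60 * 60 * 24);
  },
  async load(sessionId: string) {
    const raw = await redis.get(`mr:snapshot:${sessionId}`);
    return raw ? JSON.parse(raw) : undefined;
  },
};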
Security Note
⚠️ Snapshots contain user content — treat them as sensitive data. Encrypt at rest and in transit.
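For example, you could encrypt a serialized snapshot before it leaves your process. A minimal sketch using Node's built-in crypto module with AES-256-GCM; key management is up to you:
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

function encryptSnapshot(snapshot: object, key: Buffer): string {
  // key must be 32 bytes for aes-256-gcm; a 12-byte IV is the GCM standard
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(snapshot), "utf8"),
    cipher.final(),
  ]);
  // Pack iv + auth tag + ciphertext into one portable string
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString("base64");
}

function decryptSnapshot(blob: string, key: Buffer): object {
  const buf = Buffer.from(blob, "base64");
  const decipher = createDecipheriv("aes-256-gcm", key, buf.subarray(0, 12));
  decipher.setAuthTag(buf.subarray(12, 28));
  const plaintext = Buffer.concat([decipher.update(buf.subarray(28)), decipher.final()]);
  return JSON.parse(plaintext.toString("utf8"));
}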
Size Optimization
Use compact mode for smaller payloads:
const { snapshot } = session.compile({
userMessage: "...",
budgetTokens: 2000,
returnSnapshot: "compact" // Strips artifact content, keeps only IDs
});
Compact snapshots include:
- Full state (constraints, decisions, etc.)
- Artifact metadata (IDs, sources, timestamps) — content removed
- Last 10 events only
Typical sizes (see the measurement sketch after this list):
- Full snapshot: 50-200 KB (depending on artifacts)
- Compact snapshot: 5-20 KB (~90% smaller)
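These numbers vary by workload. You can measure your own sessions with exportSnapshot, as in this quick sketch:
const full = session.exportSnapshot("full");
const compact = session.exportSnapshot("compact");
console.log("full bytes:", Buffer.byteLength(JSON.stringify(full)));
console.log("compact bytes:", Buffer.byteLength(JSON.stringify(compact)));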
KB Ingestion Pattern
Track knowledge base changes via content hashing:
import { createHash } from 'crypto';
const kbHash = createHash('sha256').update(kbContent).digest('hex').slice(0, 16);
// Check if KB needs re-ingestion
if (snapshot.meta?.kbHash !== kbHash) {
// Ingest KB artifacts
session.ingest(makeSnippetArtifact(kbContent, 'kb.txt', { pinned: true }));
// Update metadata
snapshot.meta = { ...snapshot.meta, kbHash, kbIngested: true };
}
API Reference
createSession(options?)
Factory for creating sessions:
createSession({
sessionId?: string, // Optional ID (auto-generated if omitted)
snapshot?: Snapshot, // Load from existing snapshot
limits?: {
maxEvents?: number, // Default: 50
maxArtifacts?: number // Default: 50
}
})
Session.fromSnapshot(snapshot, limits?)
DX-friendly alternative:
import { Session } from "memory-runtime";
const session = Session.fromSnapshot(previousSnapshot);
session.ingest(event)
Ingest events with compile-time type safety:
// User message
session.ingest({
type: "user_message",
payload: { content: string }
});
// Assistant response
session.ingest({
type: "assistant_response",
payload: { content: string }
});
// Artifacts (code, docs, diffs)
session.ingest({
type: "snippet" | "doc_chunk" | "repo_diff",
payload: {
source: string,
content: string,
meta?: object,
pinned?: boolean // Pinned artifacts survive buffer churn
}
});
Helper functions for common artifacts:
import { makeSnippetArtifact, makeRepoDiffArtifact } from "memory-runtime";
const snippet = makeSnippetArtifact(code, "file.ts", { startLine: 10, endLine: 20 });
session.ingest(snippet);
const diff = makeRepoDiffArtifact(gitDiffOutput, { repoPath: "/path/to/repo" });
session.ingest(diff);
session.compile(options)
Generate messages within token budget:
const result = session.compile({
userMessage: string,
budgetTokens: number,
stablePrefix?: string, // Optional system prompt prefix
returnSnapshot?: "full" | "compact"
});
// Returns:
{
messages: Message[], // Ready for LLM API
debug: {
includedArtifacts: string[], // IDs of included artifacts
droppedArtifacts: string[], // IDs of dropped artifacts
tokenEstimate: number, // Estimated tokens (never exceeds budget)
rationale: string // Explanation of selection
},
snapshot: Snapshot // Updated snapshot
}
session.observe(input)
Extract structured state from assistant responses:
const result = session.observe({
assistantText: string,
returnSnapshot?: "full" | "compact"
});
// Returns:
{
snapshot: Snapshot // State updated with extracted constraints/decisions/etc.
}
Extraction markers (optional, for structured updates; see the example after this list):
Decision: Use JWT for authentication
Constraint: Tokens must expire after 30 minutes
Open: How should we handle refresh tokens?
Glossary: JWT - JSON Web Token for stateless authentication
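For instance, an assistant reply containing marker lines is folded into structured state by observe(). A sketch; the exact field names inside the snapshot's state are assumed to mirror the categories listed under Stateless Architecture:
const { snapshot } = session.observe({
  assistantText: [
    "We'll go with JWTs for now.",
    "Decision: Use JWT for authentication",
    "Constraint: Tokens must expire after 30 minutes",
    "Open: How should we handle refresh tokens?"
  ].join("\n")
});
// The decision, constraint, and open thread above now live in the
// snapshot's state and will be considered by future compile() calls.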
session.exportSnapshot(mode?)
Export current snapshot:
const fullSnapshot = session.exportSnapshot("full"); // Complete snapshot
const compactSnapshot = session.exportSnapshot("compact"); // Minimal payload
session.clear()
Reset state/artifacts, preserve sessionId and meta:
session.clear();
Not Summarization
This is not an LLM-based summarization script. Instead, memory-runtime uses a deterministic compilation engine:
- State: Structured records of decisions, constraints, glossary terms
- Artifacts: Content-addressed storage for file snapshots and diffs
- Budgeting: Deterministic truncation and prioritization that fits your token limit
Determinism guarantee: Same snapshot + same inputs = identical output.
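Because compilation is deterministic, you can assert this property directly in a test. A sketch, assuming savedSnapshot is a previously exported snapshot:
import { Session } from "memory-runtime";

const a = Session.fromSnapshot(savedSnapshot)
  .compile({ userMessage: "Explain the auth flow", budgetTokens: 2000 });
const b = Session.fromSnapshot(savedSnapshot)
  .compile({ userMessage: "Explain the auth flow", budgetTokens: 2000 });

// Same snapshot + same inputs must produce byte-identical messages
console.assert(JSON.stringify(a.messages) === JSON.stringify(b.messages));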
Examples
See examples/stateless-conversation.ts for a complete 3-turn demo showing:
- Snapshot serialization (JSON.stringify/parse)
- KB ingestion with hash tracking
- Compact vs full snapshot modes
- State persistence across turns
Run it:
npm run build
tsx examples/stateless-conversation.ts
Testing
# Determinism: same snapshot + inputs = same output
tsx scripts/test-determinism.ts
# Budget enforcement: never exceed budgetTokens
tsx scripts/test-budget.ts
# Pinning: pinned artifacts survive buffer churn
tsx scripts/test-pinning.ts
# Compact mode: verify size reduction
tsx scripts/test-snapshot-size.ts
License: MIT
