@selfagency/llm-stream-parser

v0.3.1

Composable parsers and stream processing utilities for LLM responses

License: MIT

Features

  • 🧠 Thinking extraction — Parse and separate <think> reasoning sections from visible output, chunk-by-chunk
  • 🧼 XML stream filtering — Scrub context blocks and privacy tags from streaming output
  • 🛠️ Tool-call extraction — Extract and validate structured XML and native tool invocations
  • 🏛️ Structured output — JSON parsing with schema validation, depth/key limits, and auto-repair
  • 🤖 Agent loops — Multi-step LLM execution with configurable stop conditions and tool handling
  • 🚰 Stream processor — Event-driven orchestrator that composes all parsers in a single pipeline
  • 🔌 Normalizers — Adapters for OpenAI, Anthropic, Gemini, Mistral, Cohere, Ollama, AWS Bedrock, and HF TGI
  • 💻 VS Code integration — ChatResponseStream renderers with thinking progress, tool feedback, and cancellation support
  • 👮‍♂️ Safety by default — Privacy tags are always scrubbed; JSON depth, key counts, and tool-call sizes are bounded

Installation

npm install @selfagency/llm-stream-parser
# or
pnpm add @selfagency/llm-stream-parser
# or
yarn add @selfagency/llm-stream-parser

Requirements: Node.js 18+, TypeScript 5.0+ (if using TypeScript)

Quick Start

import { LLMStreamProcessor } from '@selfagency/llm-stream-parser/processor';

const processor = new LLMStreamProcessor({
  parseThinkTags: true,
  knownTools: new Set(['search', 'edit_file']),
});

processor.on('thinking', delta => process.stdout.write(`[thinking] ${delta}`));
processor.on('text', delta => process.stdout.write(delta));
processor.on('tool_call', call => executeToolCall(call));

for await (const chunk of apiStream) {
  processor.process({
    content: chunk.content,
    done: chunk.done,
    stepIndex: chunk.stepIndex, // optional, useful for multi-step agent loops
    stepUsage: chunk.stepUsage, // optional per-step token usage
  });
}

StreamChunk and ProcessedOutput both support optional stepIndex and stepUsage fields so higher-level agent loops can preserve step-local metadata without custom wrappers.

Modules

@selfagency/llm-stream-parser/thinking — ThinkingParser

Chunk-by-chunk extraction of <think> blocks. Returns [thinkingContent, regularContent] on every call.

import { ThinkingParser } from '@selfagency/llm-stream-parser/thinking';

const parser = new ThinkingParser();

for await (const chunk of llmStream) {
  const [thinking, content] = parser.addContent(chunk);
  if (thinking) showReasoning(thinking);
  if (content) showOutput(content);
}

const [finalThinking, finalContent] = parser.flush();

Automatic tag detection for common models:

const deepseekParser = ThinkingParser.forModel('deepseek'); // <think></think>
const graniteParser = ThinkingParser.forModel('granite'); // <|thinking|></|thinking|>
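
To illustrate why chunk-by-chunk extraction needs internal state, here is a minimal sketch of a splitter that holds back a partial tag at the end of a chunk until the next chunk arrives. It is not the package's ThinkingParser; the class name `ThinkSplitterSketch` and the helper `partialSuffixLength` are hypothetical, and only the `[thinking, content]` return shape is taken from the description above.

```typescript
// Hypothetical stand-in for illustration; not the package's implementation.
class ThinkSplitterSketch {
  private buffer = '';
  private inThink = false;
  private readonly open: string;
  private readonly close: string;

  constructor(open = '<think>', close = '</think>') {
    this.open = open;
    this.close = close;
  }

  // Returns [thinking, content] for the text that can be classified so far.
  addContent(chunk: string): [string, string] {
    this.buffer += chunk;
    let thinking = '';
    let content = '';
    while (this.buffer.length > 0) {
      const tag = this.inThink ? this.close : this.open;
      const idx = this.buffer.indexOf(tag);
      if (idx === -1) {
        // No complete tag: emit everything except a trailing partial tag.
        const keep = partialSuffixLength(this.buffer, tag);
        const emit = this.buffer.slice(0, this.buffer.length - keep);
        if (this.inThink) thinking += emit;
        else content += emit;
        this.buffer = this.buffer.slice(this.buffer.length - keep);
        break;
      }
      const before = this.buffer.slice(0, idx);
      if (this.inThink) thinking += before;
      else content += before;
      this.buffer = this.buffer.slice(idx + tag.length);
      this.inThink = !this.inThink;
    }
    return [thinking, content];
  }

  // Emits whatever is still buffered once the stream has ended.
  flush(): [string, string] {
    const rest = this.buffer;
    this.buffer = '';
    return this.inThink ? [rest, ''] : ['', rest];
  }
}

// Length of the longest proper prefix of `tag` that `text` ends with.
function partialSuffixLength(text: string, tag: string): number {
  const max = Math.min(text.length, tag.length - 1);
  for (let len = max; len > 0; len--) {
    if (text.endsWith(tag.slice(0, len))) return len;
  }
  return 0;
}
```

With this approach a tag split across two chunks (for example `<th` followed by `ink>`) is never leaked into the visible output, at the cost of delaying a few characters by one chunk.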

@selfagency/llm-stream-parser/xml-filter — XmlStreamFilter

Stream-safe scrubbing of XML context and privacy blocks.

import { createXmlStreamFilter } from '@selfagency/llm-stream-parser/xml-filter';

const filter = createXmlStreamFilter({ enforcePrivacyTags: true });

for await (const chunk of llmStream) {
  output.write(filter.write(chunk));
}
output.write(filter.end());

Privacy tags are enforced by default (enforcePrivacyTags: true). Pass enforcePrivacyTags: false to opt out explicitly.
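
As a rough illustration of what "stream-safe" scrubbing involves, the sketch below drops everything inside a single tag pair even when the tags are split across chunks, and fails closed if the stream ends inside an unterminated block. The factory name `createScrubberSketch` and the tag name `private` are assumptions made for this example; the package's actual tag list and implementation are not shown in this README.

```typescript
// Hypothetical sketch; the real filter handles more tag types than this.
function createScrubberSketch(tagName: string) {
  const open = `<${tagName}>`;
  const close = `</${tagName}>`;
  let buffer = '';
  let inside = false;

  // Length of the longest proper prefix of `tag` that `text` ends with.
  const heldBack = (text: string, tag: string): number => {
    for (let len = Math.min(text.length, tag.length - 1); len > 0; len--) {
      if (text.endsWith(tag.slice(0, len))) return len;
    }
    return 0;
  };

  return {
    // Returns the part of the input that is safe to show right now.
    write(chunk: string): string {
      buffer += chunk;
      let out = '';
      for (;;) {
        const tag = inside ? close : open;
        const idx = buffer.indexOf(tag);
        if (idx === -1) {
          const keep = heldBack(buffer, tag);
          if (!inside) out += buffer.slice(0, buffer.length - keep);
          buffer = buffer.slice(buffer.length - keep);
          return out;
        }
        if (!inside) out += buffer.slice(0, idx);
        buffer = buffer.slice(idx + tag.length);
        inside = !inside;
      }
    },
    // An unterminated block is dropped rather than leaked.
    end(): string {
      const rest = inside ? '' : buffer;
      buffer = '';
      return rest;
    },
  };
}
```

The `write()`/`end()` pair mirrors the usage shown above: `write()` may return less text than it received, and `end()` releases anything that was held back only because it looked like the start of a tag.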


@selfagency/llm-stream-parser/context — Context splitting & dedup

import {
  splitLeadingXmlContextBlocks,
  dedupeXmlContextBlocksByTag,
  stripXmlContextTags,
} from '@selfagency/llm-stream-parser/context';

const { contextBlocks, remaining } = splitLeadingXmlContextBlocks(response);
const unique = dedupeXmlContextBlocksByTag(contextBlocks);
const clean = stripXmlContextTags(remaining);

@selfagency/llm-stream-parser/tool-calls — XML tool-call extraction

import { extractXmlToolCalls, buildXmlToolSystemPrompt } from '@selfagency/llm-stream-parser/tool-calls';

// Extract tool calls from a response
const calls = extractXmlToolCalls(response, new Set(['search', 'edit_file']));
for (const call of calls) {
  await executeTool(call.name, call.parameters);
}

// Build the system prompt that teaches the model to emit tool calls
const systemPrompt = buildXmlToolSystemPrompt([
  {
    name: 'search',
    description: 'Search the web',
    inputSchema: { properties: { query: { type: 'string' } }, required: ['query'] },
  },
  { name: 'edit_file', description: 'Edit a file' },
]);

buildXmlToolSystemPrompt throws on invalid tool names; extractXmlToolCalls never throws and silently drops malformed calls.


@selfagency/llm-stream-parser/structured — JSON parsing & validation

import { parseJson, validateJsonSchema } from '@selfagency/llm-stream-parser/structured';

// Tolerant parse — returns null on failure, never throws
const data = parseJson(responseText, { maxJsonDepth: 10, maxJsonKeys: 100 });

// Schema validation — returns discriminated union
const result = validateJsonSchema(responseText, {
  type: 'object',
  properties: { name: { type: 'string' }, age: { type: 'integer' } },
  required: ['name'],
});

if (result.success) {
  console.log(result.data);
} else {
  console.error(result.errors);
}

Additional utilities: buildFormatInstructions, buildRepairPrompt, streamJson, zodToJsonSchema, validateWithZod, repairWithLLM, pipe, buildNativeToolsArray.
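
The depth and key limits can be pictured with a small standalone sketch. This is not the package's parseJson (which also performs auto-repair); the function name `parseJsonBounded` and the option names `maxDepth` and `maxKeys` are hypothetical, and the only behaviour borrowed from above is "return null instead of throwing".

```typescript
interface LimitsSketch {
  maxDepth: number; // deepest allowed nesting of objects/arrays (root = 1)
  maxKeys: number; // total object keys allowed across the whole document
}

// Returns the parsed value, or null if the text is invalid or exceeds a limit.
function parseJsonBounded(text: string, limits: LimitsSketch): unknown {
  let value: unknown;
  try {
    value = JSON.parse(text);
  } catch {
    return null;
  }

  let keys = 0;
  const withinLimits = (node: unknown, depth: number): boolean => {
    if (node === null || typeof node !== 'object') return true;
    if (depth > limits.maxDepth) return false;
    const isArray = Array.isArray(node);
    const children: unknown[] = isArray
      ? (node as unknown[])
      : Object.values(node as Record<string, unknown>);
    if (!isArray) {
      keys += children.length;
      if (keys > limits.maxKeys) return false;
    }
    return children.every(child => withinLimits(child, depth + 1));
  };

  return withinLimits(value, 1) ? value : null;
}
```

One consequence of a null-on-failure contract is that the literal JSON text `null` is indistinguishable from a failed parse, which is one reason a discriminated-union result such as the one returned by validateJsonSchema can be easier to consume.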


@selfagency/llm-stream-parser/agent — Multi-step agent loops

Execute multi-step reasoning loops with automatic tool handling and configurable stopping conditions.

import { createAgentLoop } from '@selfagency/llm-stream-parser/agent';

const agent = createAgentLoop({
  // Call your LLM with current message history
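  execute: async function* (messages) {
    const response = await fetch('https://api.example.com/chat', {
      method: 'POST',
      body: JSON.stringify({ messages }),
    });
    if (!response.body) throw new Error('Response has no body');

    // response.body yields Uint8Array chunks, so decode them as UTF-8 text
    const decoder = new TextDecoder();
    for await (const chunk of response.body) {
      yield { content: decoder.decode(chunk, { stream: true }), done: false };
    }
  },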

  // Stop when thinking is detected (or return false to continue)
  stopWhen: state => state.lastOutput.thinking.length > 0,

  // Build tool results to append to conversation
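  buildToolResultMessages: async toolCalls => {
    const results = await Promise.all(
      toolCalls.map(async call => {
        // executeTool is your own tool dispatcher, not part of this package
        const result = await executeTool(call);
        return {
          role: 'user',
          content: `Tool "${call.name}" executed with result: ${JSON.stringify(result)}`,
        };
      }),
    );
    return results;
  },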

  // Optional: Called after each step
  onStep: async result => {
    console.log(`Step ${result.output.done ? 'done' : 'in progress'}`);
  },
});

// Run the loop
for await (const part of agent.run([{ role: 'user', content: 'Solve this...' }])) {
  if (part.type === 'text') console.log(part.text);
  if (part.type === 'tool_call') await executeTool(part.call);
}
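
For readers who want to see the control flow such a loop implies, here is a simplified, self-contained sketch. It is not createAgentLoop: the names `runAgentLoopSketch`, `runTool`, and `maxSteps` are hypothetical, each step is non-streaming for brevity, and tool calls are reduced to plain names.

```typescript
type MessageSketch = { role: 'user' | 'assistant'; content: string };
type StepSketch = { text: string; toolCalls: string[] };

interface LoopOptionsSketch {
  execute: (history: MessageSketch[]) => Promise<StepSketch>;
  runTool: (name: string) => Promise<string>;
  stopWhen: (step: StepSketch, stepIndex: number) => boolean;
  maxSteps?: number;
}

async function* runAgentLoopSketch(
  initial: MessageSketch[],
  options: LoopOptionsSketch,
): AsyncGenerator<StepSketch> {
  const history = [...initial];
  const maxSteps = options.maxSteps ?? 8; // hard cap so the loop always terminates

  for (let stepIndex = 0; stepIndex < maxSteps; stepIndex++) {
    const step = await options.execute(history);
    yield step;
    history.push({ role: 'assistant', content: step.text });

    // Stop on the caller's condition, or when the model requested no tools.
    if (options.stopWhen(step, stepIndex) || step.toolCalls.length === 0) return;

    // Feed each tool result back into the conversation for the next step.
    for (const name of step.toolCalls) {
      const result = await options.runTool(name);
      history.push({ role: 'user', content: `Tool "${name}" returned: ${result}` });
    }
  }
}
```

The explicit step cap is a design choice of this sketch: a stop condition supplied by the caller may never fire, so a bounded loop is a cheap safeguard against runaway token usage.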

@selfagency/llm-stream-parser/normalizers — Provider normalizers

Normalize streaming events from different providers into a common StreamChunk shape:

import { normalizeOpenAI } from '@selfagency/llm-stream-parser/normalizers';

for await (const event of openaiStream) {
  const { chunk } = normalizeOpenAI(event);
  if (chunk) processor.process(chunk);
}

Supported: openai, openaiResponses, anthropic, gemini, mistral, cohere, ollama, bedrock, hfTgi.
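
To make "a common StreamChunk shape" more concrete, the sketch below maps an OpenAI-style chat completion chunk (`choices[0].delta.content` plus `finish_reason`) onto a minimal `{ content, done }` object. The types `OpenAIChunkLike` and `StreamChunkSketch` and the function `normalizeOpenAILike` are hypothetical; the package's normalizeOpenAI returns a richer result than this.

```typescript
// Minimal structural view of an OpenAI-style streaming chunk.
interface OpenAIChunkLike {
  choices: Array<{
    delta?: { content?: string | null };
    finish_reason?: string | null;
  }>;
}

// Hypothetical reduced form of the common chunk shape.
interface StreamChunkSketch {
  content: string;
  done: boolean;
}

function normalizeOpenAILike(event: OpenAIChunkLike): StreamChunkSketch | null {
  const choice = event.choices[0];
  if (!choice) return null; // e.g. a usage-only event with no choices
  return {
    content: choice.delta?.content ?? '',
    done: choice.finish_reason != null,
  };
}
```

Returning null for events that carry no text lets the caller keep the `if (chunk) processor.process(chunk)` pattern shown above.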


@selfagency/llm-stream-parser/adapters — High-level adapters

import { createGenericAdapter } from '@selfagency/llm-stream-parser/adapters';

const adapter = createGenericAdapter(
  {
    onContent: text => display(text),
    onThinking: text => displayReasoning(text),
    onToolCall: call => executeToolCall(call),
  },
  { parseThinkTags: true, scrubContextTags: true },
);

await adapter.write(chunk);
await adapter.end();

@selfagency/llm-stream-parser/ui — Event-sourced conversation state

import { LLMStreamProcessor } from '@selfagency/llm-stream-parser/processor';
import { createConversationStoreFromProcessor } from '@selfagency/llm-stream-parser/ui';

const processor = new LLMStreamProcessor({ scrubContextTags: false });
const bridge = createConversationStoreFromProcessor(processor, {
  conversationId: 'conv-1',
});

processor.process({ stepIndex: 0, thinking: 'plan' });
processor.process({ content: 'Hello', done: true, finishReason: 'stop' });

console.log(bridge.store.getState().messages);
bridge.dispose();

The UI package can now consume reducer-friendly processor events automatically, including step lifecycle markers, streaming tool-call updates, and final message usage.


@selfagency/llm-stream-parser/pipeline — Output transforms

import { LLMStreamProcessor } from '@selfagency/llm-stream-parser/processor';
import { createSmoothStream, createThinkingFilter } from '@selfagency/llm-stream-parser/pipeline';

const processor = new LLMStreamProcessor({
  transforms: [createThinkingFilter(), createSmoothStream({ chunkSize: 4, delayMs: 25 })],
});

createSmoothStream() splits large text bursts into smaller parts and can optionally pause between sub-chunks with delayMs for steadier UI output.
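
The splitting-and-pausing idea can be sketched independently of the package. In the example below, `splitBurst` and `smoothSketch` are hypothetical names, and `chunkSize` is assumed to count characters (code points); the README does not state which unit createSmoothStream uses.

```typescript
// Split text into parts of at most `chunkSize` code points.
function splitBurst(text: string, chunkSize: number): string[] {
  if (chunkSize <= 0) throw new RangeError('chunkSize must be positive');
  const chars = Array.from(text); // by code point, so surrogate pairs stay whole
  const parts: string[] = [];
  for (let i = 0; i < chars.length; i += chunkSize) {
    parts.push(chars.slice(i, i + chunkSize).join(''));
  }
  return parts;
}

// Yield the parts one by one, optionally pausing between them.
async function* smoothSketch(
  text: string,
  chunkSize: number,
  delayMs = 0,
): AsyncGenerator<string> {
  for (const part of splitBurst(text, chunkSize)) {
    yield part;
    if (delayMs > 0) {
      await new Promise<void>(resolve => setTimeout(resolve, delayMs));
    }
  }
}
```

For example, `splitBurst('Hello, world', 4)` produces `['Hell', 'o, w', 'orld']`, and `smoothSketch` would emit those three parts with a pause after each one.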


@selfagency/llm-stream-parser/formatting — Output sanitization

import {
  sanitizeNonStreamingModelOutput,
  formatXmlLikeResponseForDisplay,
} from '@selfagency/llm-stream-parser/formatting';

@selfagency/llm-stream-parser/markdown — Markdown utilities

import { appendToBlockquote } from '@selfagency/llm-stream-parser/markdown';

Renderers

Renderers stream LLM response content to specific output targets (plain text, formatted terminal, browser DOM, VS Code chat). Each renderer owns an internal LLMStreamProcessor and handles thinking blocks, tool calls, step changes, and error callbacks.

All renderers use a factory pattern and implement the same { write(chunk), writeChunk(streamChunk), end() } interface:

import { createPlainTextRenderer } from '@selfagency/llm-stream-parser/renderers/plain';

const renderer = createPlainTextRenderer({
  showThinking: true,
  onError: err => logger.error(err),
  onStep: async (stepIndex, usage) => {
    logger.info(`entered step ${stepIndex}`, usage);
  },
});

await renderer.write('# Response\n');
await renderer.write('Content here');
await renderer.writeChunk({ content: 'Structured content', stepIndex: 1, done: false });
await renderer.end();
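
Because every renderer exposes the same three methods, code that feeds a renderer does not need to know which one it has. The sketch below shows one way to exploit that; the `RendererLike` and `StreamChunkLike` types and the `pipeToRenderer` helper are assumptions written for this example (including the promise-returning signatures), not exports of the package.

```typescript
interface StreamChunkLike {
  content?: string;
  stepIndex?: number;
  done?: boolean;
}

// Structural view of the shared renderer interface described above.
interface RendererLike {
  write(chunk: string): Promise<void>;
  writeChunk(chunk: StreamChunkLike): Promise<void>;
  end(): Promise<void>;
}

// Forward plain strings to write() and structured chunks to writeChunk().
async function pipeToRenderer(
  source: AsyncIterable<string | StreamChunkLike>,
  renderer: RendererLike,
): Promise<void> {
  try {
    for await (const item of source) {
      if (typeof item === 'string') await renderer.write(item);
      else await renderer.writeChunk(item);
    }
  } finally {
    await renderer.end(); // always finalize, even if the source throws
  }
}
```

Any object that satisfies `RendererLike` structurally, including a test double that just records calls, can be passed to the helper.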

Plain Text Renderer

Zero-dependency renderer for CLI/logging. Prefix-based thinking blocks.

import { createPlainTextRenderer } from '@selfagency/llm-stream-parser/renderers/plain';

const renderer = createPlainTextRenderer({
  showThinking: true,
  thinkingPrefix: '[💭] ', // customize thinking block prefix
  output: text => process.stdout.write(text), // optional; defaults to process.stdout
});

await renderer.write(chunk);
await renderer.end();

Thinking style: prefix-based. The default prefix is [Thinking]; override it with the thinkingPrefix option.


CLI Markdown Renderer

Terminal-formatted markdown with blockquote thinking blocks. Requires peer dependency: npm install cli-markdown

import { createCliRenderer } from '@selfagency/llm-stream-parser/renderers/cli';

const renderer = createCliRenderer({
  showThinking: true,
  thinkingStyle: 'blockquote', // or 'suppress'
  output: text => process.stdout.write(text),
});

await renderer.write(chunk);
await renderer.end();

Thinking Styles:

  • blockquote (default): Render thinking as > **💭 Thinking:** ... markdown blockquote
  • suppress: Hide thinking blocks entirely
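
As an illustration of the blockquote style, the following sketch turns a block of thinking text into the `> **💭 Thinking:** ...` form quoted above. The helper name `toThinkingBlockquote` is hypothetical, and the handling of multi-line input (continuation lines prefixed with `> `) is an assumption rather than documented behaviour.

```typescript
// Format thinking text as a markdown blockquote with a label on the first line.
function toThinkingBlockquote(thinking: string): string {
  const lines = thinking.trim().split('\n');
  return lines
    .map((line, index) => (index === 0 ? `> **💭 Thinking:** ${line}` : `> ${line}`))
    .join('\n');
}
```

A markdown renderer then treats the whole run of `> ` lines as one quoted block, which keeps reasoning visually separate from the answer.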

Streaming Markdown Renderer

Browser-based DOM rendering with incremental updates and security sanitization. Requires peer dependencies: npm install streaming-markdown dompurify

import { createStreamingMarkdownRenderer } from '@selfagency/llm-stream-parser/renderers/streaming-md';

const target = document.getElementById('response');
const renderer = createStreamingMarkdownRenderer({
  target,
  showThinking: true,
  thinkingContainer: document.getElementById('thinking'), // optional separate container
  onSecurityViolation: () => console.warn('XSS attempt blocked'),
});

await renderer.write(chunk);
await renderer.end();

Thinking style: blockquote (default) or inline. If thinkingContainer is provided, thinking is rendered into that separate container.


VS Code Chat Renderer

Integration with VS Code's ChatResponseStream for Copilot extensions. Stream LLM responses directly to VS Code's Chat interface with built-in support for thinking blocks, tool invocations, and token usage. No external dependencies required.

Features:

  • Automatic thinking block rendering (blockquote or progress indicator)
  • Tool invocation callbacks for real-time feedback
  • Token usage reporting
  • CancellationToken support via cancellationTokenToAbortSignal()

import { createVSCodeChatRenderer } from '@selfagency/llm-stream-parser/renderers/vscode';

const renderer = createVSCodeChatRenderer({
  stream, // VS Code ChatResponseStream
  showThinking: true,
  thinkingStyle: 'blockquote', // 'blockquote' | 'progress' | 'suppress'
});

// Stream chunks from your LLM
for await (const chunk of llmStream) {
  await renderer.writeChunk(chunk);
}

await renderer.end();

For agent loops:

import { createVSCodeAgentLoop } from '@selfagency/llm-stream-parser/renderers/vscode';

const renderer = createVSCodeAgentLoop({
  stream,
  thinkingStyle: 'blockquote', // Thinking enabled by default for agent reasoning
});

Thinking Styles:

  • blockquote (default): Render as > **💭 Thinking:** ... markdown
  • progress: Send thinking via stream.progress() for VS Code progress indicator
  • suppress: Hide thinking blocks entirely

Tool calls fire the onToolCall callback but are not rendered as content.

When a renderer receives structured chunks via writeChunk(), it can also call onStep(stepIndex, usage) as step metadata changes.
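
The README names cancellationTokenToAbortSignal() without showing how it works. A bridge of that kind could look roughly like the sketch below, which uses a structural `CancellationTokenLike` type (mirroring the two members of VS Code's CancellationToken) so that it runs without the `vscode` module. The function name `tokenToAbortSignalSketch` is hypothetical and this is not the package's implementation.

```typescript
// Structural subset of VS Code's CancellationToken.
interface CancellationTokenLike {
  readonly isCancellationRequested: boolean;
  onCancellationRequested(listener: () => void): { dispose(): void };
}

function tokenToAbortSignalSketch(token: CancellationTokenLike): AbortSignal {
  const controller = new AbortController();

  // Already cancelled: hand back an aborted signal straight away.
  if (token.isCancellationRequested) {
    controller.abort();
    return controller.signal;
  }

  // Otherwise abort on the first cancellation event, then unsubscribe.
  let subscription: { dispose(): void } | undefined;
  subscription = token.onCancellationRequested(() => {
    controller.abort();
    subscription?.dispose();
  });
  return controller.signal;
}
```

The resulting signal can be passed to anything that accepts an AbortSignal, such as `fetch(url, { signal })`, so a cancelled chat request also cancels the underlying HTTP call.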

Ink Terminal Renderer

Beautiful, themeable terminal output for CLI/TUI applications built on React/Ink. Requires peer dependencies: npm install ink react

import { createInkRenderer } from '@selfagency/llm-stream-parser/renderers/ink';

const renderer = await createInkRenderer({
  processor, // LLMStreamProcessor instance
  showThinking: true,
  thinkingStyle: 'blockquote', // 'blockquote' | 'inline' | 'suppress'
  showToolCalls: true,
  markdown: true,
  syntaxHighlight: true, // Code fence highlighting (requires cli-highlight)
  theme: 'catppuccin-mocha', // See available themes below
  screenReader: false, // Accessibility mode
  keyboard: {
    enabled: true,
    onInterrupt: () => process.exit(0),
    onCancel: () => renderer.end(),
  },
  onWarning: msg => console.warn(msg),
  onFinish: () => console.log('Stream complete'),
});

// Stream chunks via processor events
processor.on('text', delta => renderer.write(delta));
processor.on('done', async () => {
  await renderer.end();
  renderer.unmount(); // Clean up once the stream has finished
});

Available Themes:

  • default, dark, light, minimal — Basic themes
  • dracula — Dark purple/cyan theme
  • catppuccin-mocha, catppuccin-latte, catppuccin-macchiato, catppuccin-frappe — Pastel Catppuccin palette
  • ayu-mirage — Dark gray/amber theme
  • houston — Astro's dark blue/mint theme
  • one-dark — Classic Atom One Dark theme
  • one-candy — One Dark with pastel candy accents
  • github-dark — GitHub Primer dark theme

Custom Themes:

import type { Theme } from '@selfagency/llm-stream-parser/renderers/ink';

const customTheme: Theme = {
  thinking: { borderColor: 'magenta', textColor: 'magenta', spinnerColor: 'magenta' },
  toolCall: { pendingColor: 'yellow', doneColor: 'green', pendingSymbol: '?', doneSymbol: '✓' },
  text: { cursorSymbol: '|', dimColor: false },
  border: { style: 'round', color: 'gray' },
  highlight: { theme: 'monokai' },
};

const renderer = await createInkRenderer({
  processor,
  theme: customTheme,
  // ...other options
});

Thinking Styles:

  • blockquote (default): Render as bordered block with spinner
  • inline: Render as italic inline text with prefix
  • suppress: Hide thinking blocks entirely

Accessibility:

Set screenReader: true to disable animations and output plain text for screen reader users.


Error Handling

| Category | Behaviour |
| -------- | --------- |