@tamerxz/llm-stream
v0.1.0
Zero-dependency, provider-agnostic SSE parser for LLM streaming responses. Browser + Node.js. TypeScript-first. Under 3KB minified+gzipped.
llm-stream
llm-stream turns the raw SSE byte stream from any LLM provider into a unified, typed event stream — text deltas, tool calls, thinking blocks, finish reasons, usage — without bundling an HTTP client or framework.
Why
Every developer writes the same boilerplate to consume LLM streams:
| Step | Raw fetch | llm-stream |
|---|---|---|
| Open stream | ~5 lines | ✨ already done |
| Decode UTF-8 | ~3 lines | ✨ |
| Split on data: | ~10 lines | ✨ |
| Parse JSON | ~3 lines | ✨ |
| Extract deltas | ~15 lines | ✨ |
| Handle tool calls | ~20 lines | ✨ |
| Handle each provider differently | ×N | one API |
| Total | ~60 lines/provider | 3 lines |
Existing options are either heavy (Vercel AI SDK ≈ 50KB+ with framework deps), generic (eventsource-parser doesn't know LLM semantics), or single-provider.
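For a sense of what the table above counts as boilerplate, here is a minimal sketch of the hand-rolled approach for one provider. The helper names `ssePayloads` and `openAiTextDeltas` are hypothetical and not part of llm-stream; the chunks are assumed to be already decoded to strings, and the payload shape assumed is OpenAI's chat completion chunk format.

```typescript
// Hypothetical helper (not part of llm-stream): collect the payload of every
// complete "data:" line, buffering partial lines across chunk boundaries.
function ssePayloads(chunks: string[]): string[] {
  const payloads: string[] = [];
  let buffer = "";
  for (const chunk of chunks) {
    buffer += chunk;
    let newline = buffer.indexOf("\n");
    // Only complete lines are consumed; a trailing partial line stays buffered.
    while (newline !== -1) {
      const line = buffer.slice(0, newline).replace(/\r$/, "");
      buffer = buffer.slice(newline + 1);
      if (line.startsWith("data:")) {
        const payload = line.slice(5).trim();
        if (payload === "[DONE]") return payloads; // OpenAI end-of-stream sentinel
        payloads.push(payload);
      }
      newline = buffer.indexOf("\n");
    }
  }
  return payloads;
}

// Hypothetical helper: pull text deltas out of OpenAI-style chunk payloads.
function openAiTextDeltas(chunks: string[]): string[] {
  const deltas: string[] = [];
  for (const payload of ssePayloads(chunks)) {
    const json = JSON.parse(payload);
    const delta = json.choices?.[0]?.delta?.content;
    if (typeof delta === "string") deltas.push(delta);
  }
  return deltas;
}
```

This covers text only. Tool calls, usage, finish reasons, and a second provider each add more code of the same kind.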
Installation
npm i @tamerxz/llm-stream
No dependencies. Works in Node.js 18+, Deno, Bun, Cloudflare Workers, and modern browsers.
Quickstart
OpenAI
import { parseStream } from "@tamerxz/llm-stream";
const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
  },
  body: JSON.stringify({
    model: "gpt-4o",
    stream: true,
    messages: [{ role: "user", content: "Hello!" }],
  }),
});
for await (const event of parseStream(response, { provider: "openai" })) {
  if (event.type === "text") process.stdout.write(event.delta);
  else if (event.type === "done") console.log("\n", event.usage);
}
Anthropic
import { parseStream } from "@tamerxz/llm-stream";
const response = await fetch("https://api.anthropic.com/v1/messages", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": process.env.ANTHROPIC_API_KEY!,
    "anthropic-version": "2023-06-01",
  },
  body: JSON.stringify({
    model: "claude-opus-4-7",
    max_tokens: 1024,
    stream: true,
    messages: [{ role: "user", content: "Hello!" }],
  }),
});
for await (const event of parseStream(response, { provider: "anthropic" })) {
  if (event.type === "text") process.stdout.write(event.delta);
  else if (event.type === "thinking") console.error("[thinking]", event.delta);
  else if (event.type === "tool_use_end") console.log("tool:", event.input);
  else if (event.type === "done") console.log("\n", event.reason, event.usage);
}
Auto-detect provider
for await (const event of parseStream(response, { provider: "auto" })) {
  // library inspects the first chunk and picks the right parser
}
Callback style
await parseStream(response, {
  provider: "openai",
  onText: ({ delta }) => process.stdout.write(delta),
  onToolUse: (event) => {
    if (event.type === "tool_use_end") console.log("tool:", event.input);
  },
  onDone: ({ reason, usage }) => console.log(reason, usage),
});
When any callback is provided, parseStream returns a thenable that resolves on stream completion. Mix both styles as you see fit.
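To illustrate how the callback style relates to iteration, here is a minimal sketch of an adapter that drains an async iterable and routes each event to an optional handler. The names `dispatch`, `Handlers`, and `fakeStream` are hypothetical, the event union is trimmed to two members for brevity, and this is not the library's own implementation.

```typescript
// Trimmed-down event union, for illustration only.
type StreamEvent =
  | { type: "text"; delta: string; cumulative: string }
  | { type: "done"; reason: string };

interface Handlers {
  onText?: (event: Extract<StreamEvent, { type: "text" }>) => void;
  onDone?: (event: Extract<StreamEvent, { type: "done" }>) => void;
}

// Hypothetical adapter: the returned promise settles once the stream is drained.
async function dispatch(
  events: AsyncIterable<StreamEvent>,
  handlers: Handlers,
): Promise<void> {
  for await (const event of events) {
    if (event.type === "text") handlers.onText?.(event);
    else handlers.onDone?.(event);
  }
}

// Stand-in for a parsed stream, so the sketch runs without a network call.
async function* fakeStream(): AsyncGenerator<StreamEvent> {
  yield { type: "text", delta: "Hi", cumulative: "Hi" };
  yield { type: "done", reason: "stop" };
}
```

Calling `await dispatch(fakeStream(), { onText: (e) => console.log(e.delta) })` logs each delta and resolves after the done event.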
Abort
const controller = new AbortController();
setTimeout(() => controller.abort(), 5_000);
for await (const event of parseStream(response, {
  provider: "openai",
  signal: controller.signal,
})) {
  // on abort, iteration ends cleanly with a done event whose reason is "error"
}
Event types
All events are members of a discriminated union with a literal type field.
type StreamEvent =
  | { type: "text"; delta: string; cumulative: string }
  | { type: "tool_use_start"; id: string; name: string }
  | { type: "tool_use_delta"; id: string; delta: string }
  | { type: "tool_use_end"; id: string; input: unknown }
  | { type: "thinking"; delta: string; cumulative: string }
  | { type: "citation"; text: string; source: string }
  | { type: "error"; error: Error; recoverable: boolean }
  | {
      type: "done";
      reason: "stop" | "length" | "tool_use" | "content_filter" | "error";
      usage?: { input_tokens: number; output_tokens: number };
    };
The cumulative field on text / thinking is computed by the library — you don't need to maintain your own buffer.
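Because every member carries a literal type field, a switch statement narrows each case and can be checked for exhaustiveness by the compiler. The sketch below repeats the union so it stands alone; the `describe` helper is hypothetical and only shows the narrowing pattern, including reading cumulative instead of keeping a buffer.

```typescript
type StreamEvent =
  | { type: "text"; delta: string; cumulative: string }
  | { type: "tool_use_start"; id: string; name: string }
  | { type: "tool_use_delta"; id: string; delta: string }
  | { type: "tool_use_end"; id: string; input: unknown }
  | { type: "thinking"; delta: string; cumulative: string }
  | { type: "citation"; text: string; source: string }
  | { type: "error"; error: Error; recoverable: boolean }
  | {
      type: "done";
      reason: "stop" | "length" | "tool_use" | "content_filter" | "error";
      usage?: { input_tokens: number; output_tokens: number };
    };

// Hypothetical helper: one line of description per event.
function describe(event: StreamEvent): string {
  switch (event.type) {
    case "text":
    case "thinking":
      // cumulative already holds everything received so far.
      return `${event.type} so far: ${event.cumulative}`;
    case "tool_use_start":
      return `tool ${event.name} started (${event.id})`;
    case "tool_use_delta":
      return `tool ${event.id} received ${event.delta.length} more characters`;
    case "tool_use_end":
      return `tool ${event.id} input: ${JSON.stringify(event.input)}`;
    case "citation":
      return `citation from ${event.source}`;
    case "error":
      return `${event.recoverable ? "recoverable" : "fatal"} error: ${event.error.message}`;
    case "done":
      return `done (${event.reason})`;
    default: {
      // Compile-time guard: this assignment fails if a union member is unhandled.
      const unhandled: never = event;
      throw new Error(`Unhandled event: ${JSON.stringify(unhandled)}`);
    }
  }
}
```

If a later release adds a member to the union, the `never` assignment stops compiling until a matching case is added.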
tool_use_end.input is the fully-parsed JSON for the tool call. The library accumulates tool_use_delta fragments per tool-call id and parses them when the call completes. If parsing fails, you get a recoverable error event and iteration continues.
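The accumulation described above can be sketched as a small per-id buffer. The `ToolCallBuffer` class is hypothetical and written only to illustrate the behavior; the library's internals may differ, and treating an empty buffer as an empty object is an assumption of this sketch.

```typescript
type ToolResult =
  | { type: "tool_use_end"; id: string; input: unknown }
  | { type: "error"; error: Error; recoverable: boolean };

// Hypothetical accumulator: one string buffer per tool-call id.
class ToolCallBuffer {
  private fragments = new Map<string, string>();

  // Append a JSON fragment to the buffer for this tool call.
  append(id: string, delta: string): void {
    this.fragments.set(id, (this.fragments.get(id) ?? "") + delta);
  }

  // Parse the buffered JSON once the call completes, then drop the buffer.
  end(id: string): ToolResult {
    const raw = this.fragments.get(id) ?? "";
    this.fragments.delete(id);
    try {
      // Assumption: a call with no fragments has no arguments.
      const input: unknown = raw === "" ? {} : JSON.parse(raw);
      return { type: "tool_use_end", id, input };
    } catch (cause) {
      // A parse failure becomes a recoverable error instead of a thrown exception.
      const error = cause instanceof Error ? cause : new Error(String(cause));
      return { type: "error", error, recoverable: true };
    }
  }
}
```

Keying the buffer by id matters when a provider interleaves fragments from several tool calls in one response.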
Provider support
| Provider | Status (0.1.0) | Text | Tool use | Thinking | Citations |
|-----------|----------------|------|----------|----------|-----------|
| OpenAI | ✅ shipped | ✅ | ✅ | ✅ (o1) | — |
| Anthropic | ✅ shipped | ✅ | ✅ | ✅ | ✅ |
| Google | planned 0.2.0 | — | — | — | — |
| Mistral | planned 0.3.0 | — | — | — | — |
| Cohere | planned 0.3.0 | — | — | — | — |
| xAI | planned 0.3.0 | — | — | — | — |
Auto-detection in 0.1.0 picks between OpenAI and Anthropic based on the first chunk.
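One way first-chunk detection could work is sketched below. The `guessProvider` function is hypothetical and the library's actual heuristic is not documented here. The sketch relies on two properties of the wire formats: Anthropic streams open with a named message_start event, while OpenAI chat completion chunks carry `"object": "chat.completion.chunk"` in each data payload.

```typescript
type Provider = "openai" | "anthropic";

// Hypothetical heuristic: inspect the lines of the first decoded chunk.
function guessProvider(firstChunk: string): Provider | undefined {
  for (const rawLine of firstChunk.split("\n")) {
    const line = rawLine.trim();
    // Anthropic names every SSE event; the first one is message_start.
    if (line.startsWith("event:") && line.slice(6).trim() === "message_start") {
      return "anthropic";
    }
    if (line.startsWith("data:")) {
      try {
        const json = JSON.parse(line.slice(5).trim());
        if (json.object === "chat.completion.chunk") return "openai";
        if (json.type === "message_start") return "anthropic";
      } catch {
        // Incomplete or non-JSON payload: no decision from this line.
      }
    }
  }
  return undefined; // caller would need to fall back or wait for more data
}
```

A real implementation also has to handle a first chunk that ends mid-line, which this sketch reports as undefined.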
Bundle size
npm run size
The published ESM bundle is under 3KB minified + gzipped. Verified in CI; the build fails if it regresses past the limit.
Error handling
- Network or runtime errors propagate as error events with recoverable: false, followed by a done event with reason: "error". Iteration ends.
- Malformed JSON inside an SSE payload emits an error event with recoverable: true. Iteration continues.
- Aborting the AbortController ends iteration with { type: "done", reason: "error" }.
The library never throws synchronously from the iterator — every failure flows through events.
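Since failures arrive as events, a consumer can handle them inside the loop instead of wrapping it in try/catch. The sketch below shows one way to do that; the `collect` function and `Outcome` shape are hypothetical, and the union is trimmed to the members the loop needs.

```typescript
type StreamEvent =
  | { type: "text"; delta: string; cumulative: string }
  | { type: "error"; error: Error; recoverable: boolean }
  | { type: "done"; reason: "stop" | "length" | "tool_use" | "content_filter" | "error" };

interface Outcome {
  text: string;    // last cumulative text seen
  skipped: number; // count of recoverable errors
  reason: string;  // finish reason, or "unknown" if no done event arrived
  fatal?: Error;   // set when a non-recoverable error was reported
}

// Hypothetical consumer: tolerate recoverable errors, record a fatal one.
async function collect(events: AsyncIterable<StreamEvent>): Promise<Outcome> {
  const outcome: Outcome = { text: "", skipped: 0, reason: "unknown" };
  for await (const event of events) {
    if (event.type === "text") {
      outcome.text = event.cumulative;
    } else if (event.type === "error") {
      if (event.recoverable) outcome.skipped += 1; // e.g. one malformed payload
      else outcome.fatal = event.error; // a done event with reason "error" follows
    } else {
      outcome.reason = event.reason;
    }
  }
  return outcome;
}
```

An aborted stream shows up here as reason "error" with no fatal error recorded, which lets the caller tell a cancellation apart from a network failure.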
License
MIT — see LICENSE.
Contributing
PRs welcome. See CONTRIBUTING.md for guidelines on adding providers and capturing fixtures.
