@kyyn/llm-stream

v1.0.0

Smart buffer and pagination engine for piping LLM streams (OpenAI, Anthropic, LangChain…) into Discord messages. Handles rate limits, the 2000-char cap, markdown continuity, and graceful error recovery.

Smart buffer and pagination engine for streaming LLM responses into Discord messages.

Pipe any async LLM stream (OpenAI, Anthropic, LangChain, local models) directly into a Discord message. The library handles the Discord constraints listed below, so you don't have to write rate-limit or pagination boilerplate yourself.


Problems it solves

| Discord constraint | How @kyyn/llm-stream handles it |
|---|---|
| ~5 edits per 5 s rate limit | Interval-based flush (default 1500 ms); never edits per token |
| 2000-character message cap | Auto-paginates: spawns followUp / channel.send seamlessly |
| Broken markdown on split | Closes ``` on message A, reopens with same language on message B |
| Provider lock-in | Accepts any AsyncIterable<string> or Node.js ReadableStream |
| Mid-stream message deletion | Catches DiscordAPIError 10008 and aborts gracefully without crashing |
| Timer leaks | All setInterval handles are cleared in a finally block |
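
The mid-stream deletion case can be illustrated with a small sketch. It assumes the thrown error carries a numeric `code` property (as discord.js's DiscordAPIError does) and that 10008 is Discord's "Unknown Message" code. `isUnknownMessageError` and `safeEdit` are hypothetical names for illustration, not part of this library's documented API.

```typescript
// Discord JSON error code for "Unknown Message" (the target message no longer exists).
const UNKNOWN_MESSAGE = 10008;

// Hypothetical helper: duck-types the error so no discord.js import is needed here.
function isUnknownMessageError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === UNKNOWN_MESSAGE
  );
}

// Hypothetical wrapper: runs one edit and reports whether streaming should continue.
// Returns false when the message was deleted; any other error is rethrown.
async function safeEdit(edit: () => Promise<void>): Promise<boolean> {
  try {
    await edit();
    return true;
  } catch (err) {
    if (isUnknownMessageError(err)) return false;
    throw err;
  }
}
```

A caller could stop consuming the LLM stream as soon as `safeEdit` returns false, rather than letting the rejection propagate.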


Installation

npm install @kyyn/llm-stream
# discord.js is a peer dependency — install it if you haven't already
npm install discord.js

Quick start

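import { DiscordLLMStreamer } from '@kyyn/llm-stream';

// Assumes `client` (a discord.js Client) and `openai` (an OpenAI SDK client) are already constructed.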

// ── Slash command handler ─────────────────────────────────────────────────────
client.on('interactionCreate', async (interaction) => {
  if (!interaction.isChatInputCommand()) return;

  // 1. Defer so Discord gives us 15 minutes to respond
  await interaction.deferReply();

  // 2. Start your LLM call (OpenAI SDK v4 example)
  const openaiStream = await openai.chat.completions.create({
    model: 'gpt-4o',
    stream: true,
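    // Passing `true` marks the option as required, so getString returns a string rather than string | null
    messages: [{ role: 'user', content: interaction.options.getString('prompt', true) }],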
  });

  // 3. Wrap the SDK stream in a plain AsyncGenerator<string>
  const textStream = (async function* () {
    for await (const chunk of openaiStream) {
      yield chunk.choices[0]?.delta?.content ?? '';
    }
  })();

  // 4. Hand it to the streamer — that's it
  const streamer = new DiscordLLMStreamer(interaction, {
    editIntervalMs: 1500, // optional, this is the default
    maxLength: 1950,       // optional, this is the default
  });

  await streamer.stream(textStream);
});

API

new DiscordLLMStreamer(target, options?)

| Parameter | Type | Description |
|---|---|---|
| target | CommandInteraction \| Message | A deferred interaction or any message to reply to |
| options.editIntervalMs | number | Milliseconds between Discord edits. Default: 1500 |
| options.maxLength | number | Characters before a new message is spawned. Default: 1950 |

streamer.stream(source)

| Parameter | Type | Description |
|---|---|---|
| source | AsyncIterable<string> \| NodeJS.ReadableStream | Any stream of text tokens |

Returns a Promise<void> that resolves once the stream is fully consumed and the final edit has been made.
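
Since `stream()` accepts both text iterables and Node.js streams, binary chunks have to be decoded to text somewhere. How the library does this internally is not documented here. The following is only a minimal sketch of one way to normalise both shapes, using a hypothetical `toTextChunks` helper and the standard TextDecoder.

```typescript
// Hypothetical input type: strings (LLM SDK deltas) or bytes (e.g. Buffers from a Node stream).
type TextSource = AsyncIterable<string | Uint8Array>;

// Hypothetical helper, not part of @kyyn/llm-stream's documented API.
async function* toTextChunks(source: TextSource): AsyncGenerator<string> {
  // One decoder for the whole stream, so a multi-byte character that is
  // split across two binary chunks is reassembled correctly.
  const decoder = new TextDecoder("utf-8");
  for await (const chunk of source) {
    const text =
      typeof chunk === "string" ? chunk : decoder.decode(chunk, { stream: true });
    if (text.length > 0) yield text; // skip empty deltas
  }
  const tail = decoder.decode(); // flush any bytes still held by the decoder
  if (tail.length > 0) yield tail;
}
```

Node.js Readable streams are async-iterable, and a Buffer is a Uint8Array, so a byte stream could be passed to this helper directly.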


Provider examples

OpenAI

const raw = await openai.chat.completions.create({ model: 'gpt-4o', stream: true, messages });

await streamer.stream(
  (async function* () {
    for await (const chunk of raw) {
      yield chunk.choices[0]?.delta?.content ?? '';
    }
  })(),
);

Anthropic

const raw = await anthropic.messages.create({ model: 'claude-3-5-sonnet-latest', stream: true, ... });

await streamer.stream(
  (async function* () {
    for await (const event of raw) {
      if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
        yield event.delta.text;
      }
    }
  })(),
);

LangChain

const chain = prompt.pipe(llm);

await streamer.stream(
  (async function* () {
    for await (const chunk of await chain.stream({ question })) {
      yield chunk.content as string;
    }
  })(),
);

Node.js ReadableStream

import { Readable } from 'node:stream';

const readable = Readable.from(['Hello', ' ', 'world']);
await streamer.stream(readable);

Markdown continuity

When a response contains a code block that straddles the maxLength boundary (1950 characters by default), the library preserves the formatting automatically:

Message A (finalised at the split):

Here is the implementation:

```javascript
function greet(name) {
  console.log(`Hello, ${name
```

Message B (continuation):

```javascript
}!`);
}
```

The closing ``` is appended to message A, and the opening ```javascript is prepended to message B. Users see seamlessly formatted code across both messages.
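
The close-and-reopen behaviour described above can be sketched as a pure function. This is an illustration under simplifying assumptions, not the library's actual code: `openFenceLanguage` and `splitPreservingFence` are hypothetical names, every line that starts with three backticks is treated as a fence toggle, and the case where the cut lands inside a fence line itself is not handled.

```typescript
// Built with repeat() so this example contains no literal fence of its own.
const FENCE = "`".repeat(3);

// Returns the language tag of the code block still open at the end of `text`,
// or null when every fence is closed. An open block without a tag returns "".
function openFenceLanguage(text: string): string | null {
  let open: string | null = null;
  for (const line of text.split("\n")) {
    const trimmed = line.trimStart();
    if (!trimmed.startsWith(FENCE)) continue;
    // A fence line opens a block if none is open, otherwise it closes the open one.
    open = open === null ? trimmed.slice(FENCE.length).trim() : null;
  }
  return open;
}

// Splits `text` into [head, tail] with head.length <= maxLength.
// If the cut falls inside a code block, head gets a closing fence and
// tail is prefixed with an opening fence carrying the same language tag.
function splitPreservingFence(text: string, maxLength: number): [string, string] {
  if (text.length <= maxLength) return [text, ""];
  // Reserve room for the newline plus closing fence we may append to head.
  const cut = maxLength - (FENCE.length + 1);
  let head = text.slice(0, cut);
  let tail = text.slice(cut);
  const lang = openFenceLanguage(head);
  if (lang !== null) {
    head = `${head}\n${FENCE}`;
    tail = `${FENCE}${lang}\n${tail}`;
  }
  return [head, tail];
}
```

Reserving space for the closing fence before cutting keeps the first part within the limit even after the fence is appended.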


How the rate limiter works

Discord allows approximately 5 message edits per 5 seconds per message. An LLM stream can produce 20–100 tokens per second, so editing the message once per token would exceed that limit almost immediately.

DiscordLLMStreamer solves this by buffering tokens and flushing them on a timer:

LLM stream  ──tokens──▶  pendingBuffer  ──every 1500ms──▶  message.edit()
                               │
                    (if length ≥ 1950)
                               │
                               ▼
                         paginate()  ──▶  followUp() / channel.send()

  1. Token ingestion: every yielded string is appended to an in-memory pendingBuffer. No Discord API calls happen here.
  2. Interval flush: every editIntervalMs the buffer is drained and a single message.edit() is made. At the default 1500 ms that is about 0.67 edits/s, well within Discord's limits.
  3. Pagination trigger: if currentContent.length + pendingBuffer.length ≥ maxLength, pagination runs immediately (bypassing the next interval tick).
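
The first two steps can be sketched without a Discord client by injecting the edit call. This is a simplified illustration, not the library's implementation: `streamWithIntervalFlush` and the `Edit` callback type are hypothetical, `edit` stands in for message.edit(), and pagination is left out.

```typescript
// Hypothetical stand-in for message.edit(); receives the full message content.
type Edit = (content: string) => Promise<void>;

async function streamWithIntervalFlush(
  source: AsyncIterable<string>,
  edit: Edit,
  editIntervalMs = 1500,
): Promise<string> {
  let content = ""; // text already sent
  let pendingBuffer = ""; // tokens received since the last flush
  let queue: Promise<void> = Promise.resolve(); // serialises edits
  let failed = false;
  let firstError: unknown;

  const flush = (): Promise<void> => {
    queue = queue.then(async () => {
      if (failed || pendingBuffer.length === 0) return;
      content += pendingBuffer;
      pendingBuffer = "";
      try {
        await edit(content); // one API call per flush, not per token
      } catch (err) {
        failed = true; // remember the error and stop editing
        firstError = err;
      }
    });
    return queue;
  };

  const timer = setInterval(flush, editIntervalMs);
  try {
    for await (const token of source) {
      if (failed) break;
      pendingBuffer += token; // ingestion only, no API call here
    }
  } finally {
    clearInterval(timer); // always release the timer, even if the source throws
  }
  await flush(); // final edit with whatever is still buffered
  if (failed) throw firstError;
  return content;
}
```

Chaining every flush onto one promise keeps edits in order and prevents a slow edit from overlapping with the next timer tick.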

Building from source

git clone https://github.com/kyyn/llm-stream
cd llm-stream
npm install
npm run build      # outputs to dist/
npm run typecheck  # tsc --noEmit
npm test           # jest

License

MIT © kyyn