nai-gen-x · v0.4.0 · 69 downloads

nai-gen-x

A queued generation engine for NovelAI Scripts. Wraps the api.v1 scripting API with sequential queue processing, automatic context budgeting, output budget management, exponential backoff for transient errors, and reactive state.

Installation

Method A: Copy-paste (simplest)

Copy src/gen-x.ts directly into your NovelAI Script project.

Method B: npm + nibs

If your project uses nibs or another bundler that resolves node_modules:

npm install nai-gen-x
import { GenX } from "nai-gen-x";

Note: This package distributes raw TypeScript source — no compilation step is needed. Your bundler must support .ts imports.

Quick Start

import { GenX } from "nai-gen-x";

const genx = new GenX({
  onStateChange(state) {
    console.log("State:", state.status);
  },
});

const response = await genx.generate(
  [{ role: "user", content: "Hello!" }],
  { model: "glm-4-6", max_tokens: 200 },
);

Using a MessageFactory (JIT strategy building)

const response = await genx.generate(
  async () => {
    // Called when this task is picked off the queue — not when enqueued
    const context = await buildContext();
    return {
      messages: context.messages,
      params: { max_tokens: context.budget },
      // Pin the first message (system prompt) and last 2 messages (instruction + prefill).
      // Middle messages are automatically trimmed to fit the model's context window.
      contextPinning: { head: 1, tail: 2 },
    };
  },
  { model: "glm-4-6", max_tokens: 200 },
);

Features

Queue Processing

Tasks are enqueued via generate() and processed sequentially. This prevents concurrent API calls and ensures orderly generation.
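The sequential behaviour can be sketched as a promise chain. This is an illustrative, self-contained sketch — `SequentialQueue` is a hypothetical name for demonstration, not part of the package API (the real GenX queue also handles budget waits and cancellation):

```typescript
// Minimal sketch of sequential queue processing: each task starts only
// after the previous one settles, so no two tasks ever run concurrently.
type Task<T> = () => Promise<T>;

class SequentialQueue {
  private chain: Promise<unknown> = Promise.resolve();

  // enqueue() returns a promise that settles when this task's turn completes.
  enqueue<T>(task: Task<T>): Promise<T> {
    const result = this.chain.then(task);
    // Keep the chain alive even if a task rejects.
    this.chain = result.catch(() => undefined);
    return result;
  }
}
```

Tasks run strictly one at a time, in FIFO order, even when enqueued back-to-back.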

MessageFactory / JIT Resolution

Pass a function instead of a message array to generate(). The factory is called just-in-time when the task is picked off the queue, enabling deferred strategy building based on the latest state.

Context Budgeting

When a MessageFactory returns contextPinning, GenX automatically trims the message array to fit within the model's context window before sending it to the API. This prevents generation failures on lower-tier subscriptions where api.v1.maxTokens() returns a smaller budget.

How it works: The head and tail counts designate messages that are always kept (e.g., system prompt at the front, instructions + prefill at the back). Everything in the middle is fed through a RolloverHelper that drops the oldest middle messages until the total fits within:

available = maxTokens(model) - tokenCount(head + tail) - max_tokens (output reserve)

If no messages need trimming, the array passes through unchanged. When messages are dropped, GenX logs: [GenX] Trimmed N/M middle messages (budget=..., used=...).
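The budget arithmetic above can be sketched as follows. This is a hedged, self-contained sketch of the trimming logic, not the library's internals: `trimMiddle` and the injected `count` token counter are hypothetical stand-ins for GenX's RolloverHelper and tokenizer.

```typescript
// Sketch of head/tail pinning: keep the first `head` and last `tail`
// messages, then drop the oldest middle messages until the middle fits
// within maxTokens - pinned - outputReserve.
interface Msg { role: string; content: string }

function trimMiddle(
  messages: Msg[],
  head: number,
  tail: number,
  maxTokens: number,          // model context window
  outputReserve: number,      // params.max_tokens, reserved for output
  count: (m: Msg) => number,  // token counter (assumption: injected)
): Msg[] {
  const headMsgs = messages.slice(0, head);
  const tailMsgs = tail > 0 ? messages.slice(-tail) : [];
  let middle = messages.slice(head, messages.length - tail);

  const pinned = [...headMsgs, ...tailMsgs].reduce((n, m) => n + count(m), 0);
  const budget = maxTokens - pinned - outputReserve;

  let used = middle.reduce((n, m) => n + count(m), 0);
  // Drop oldest middle messages until the middle fits the remaining budget.
  while (middle.length > 0 && used > budget) {
    used -= count(middle[0]);
    middle = middle.slice(1);
  }
  return [...headMsgs, ...middle, ...tailMsgs];
}
```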

Output Budget Management

Before each generation, GenX checks the platform's output token allowance via api.v1.script.getAllowedOutput. If the budget is insufficient, it transitions through the waiting_for_user and waiting_for_budget states, then resumes automatically when tokens become available.

Transient Error Retry

Network errors, timeouts, and aborted requests are retried with exponential backoff (2^n seconds) up to maxRetries (default: 5).
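The retry schedule can be sketched like this. `backoffSeconds` and `withRetry` are hypothetical names for illustration, and the sketch assumes the first retry waits 2^1 seconds (the source says only "2^n seconds"):

```typescript
// Delay before retry attempt n: 2^n seconds (assumption: n starts at 1).
function backoffSeconds(attempt: number): number {
  return 2 ** attempt;
}

// Retry fn on transient errors with exponential backoff, up to maxRetries.
// sleep is injectable so the schedule can be tested without real delays.
async function withRetry<T>(
  fn: () => Promise<T>,
  isTransient: (e: unknown) => boolean,
  maxRetries = 5,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (e) {
      // Non-transient errors, or exhausted retries, propagate immediately.
      if (!isTransient(e) || attempt >= maxRetries) throw e;
      await sleep(backoffSeconds(attempt + 1) * 1000);
    }
  }
}
```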

Reactive State

Subscribe to state changes to drive UI updates. The state machine broadcasts every transition to registered listeners and hooks.
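The subscribe/broadcast pattern can be sketched as a tiny store. `Store` is a hypothetical illustration, not the package's implementation; it mirrors the documented behaviour of genx.subscribe (immediate call with the current state, returns an unsubscribe function):

```typescript
// Minimal reactive store: listeners get the current state on subscribe
// and are notified on every subsequent change.
type Listener<S> = (state: S) => void;

class Store<S> {
  private listeners = new Set<Listener<S>>();
  constructor(private state: S) {}

  subscribe(fn: Listener<S>): () => void {
    this.listeners.add(fn);
    fn(this.state); // immediate call with the current state
    return () => { this.listeners.delete(fn); };
  }

  set(next: S): void {
    this.state = next;
    this.listeners.forEach((fn) => fn(next));
  }
}
```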

Cancellation

Cancel individual queued tasks with cancelQueued(taskId), or cancel everything (queued + in-progress) with cancelAll(). Pass a CancellationSignal to generate() for fine-grained control.

API Reference

new GenX(hooks?)

Creates a new GenX instance.

| Parameter | Type | Description |
|-----------|------|-------------|
| hooks | GenXHooks | Optional lifecycle hooks |

genx.state

Returns a snapshot of the current GenerationState.

genx.subscribe(listener)

const unsubscribe = genx.subscribe((state) => {
  console.log(state.status, state.queueLength);
});

Registers a listener called on every state change. The listener receives an immediate call with the current state. Returns an unsubscribe function.

genx.generate(messages, params, callback?, behaviour?, signal?)

generate(
  messages: Message[] | MessageFactory,
  params: GenerationParams & {
    minTokens?: number;
    maxRetries?: number;
    taskId?: string;
  },
  callback?: (choices: GenerationChoice[], final: boolean) => void,
  behaviour?: "background" | "blocking",
  signal?: CancellationSignal,
): Promise<GenerationResponse>

Enqueues a generation request. Parameters mirror api.v1.generate with additional options:

| Parameter | Type | Description |
|-----------|------|-------------|
| messages | Message[] \| MessageFactory | Messages array or factory function for JIT resolution |
| params | GenerationParams & extras | Generation parameters (see below) |
| callback | function | Optional streaming callback |
| behaviour | string | "background" or "blocking" |
| signal | CancellationSignal | Optional cancellation signal |

Extra params:

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| minTokens | number | 1 | Minimum tokens required for budget check |
| maxRetries | number | 5 | Max transient error retries |
| taskId | string | api.v1.uuid() | Custom task identifier |

genx.getTaskStatus(taskId)

Returns "queued", "processing", or "not_found".

genx.cancelQueued(taskId)

Removes a task from the queue. Returns true if the task was found and cancelled, or if it was already absent.

genx.cancelAll()

Cancels all queued tasks and the currently executing task (if any).

genx.userInteraction()

Manually signal a user interaction to transition from waiting_for_user to waiting_for_budget. This is called automatically when the user clicks "Generate" in the NovelAI UI.

Types

GenerationState

interface GenerationState {
  status:
    | "idle"
    | "queued"
    | "generating"
    | "waiting_for_budget"
    | "waiting_for_user"
    | "completed"
    | "failed";
  error?: string;
  queueLength: number;
  budgetWaitEndTime?: number;
}

MessageFactory

type MessageFactory = () => Promise<{
  messages: Message[];
  params?: Partial<GenerationParams>;
  contextPinning?: ContextPinning;
}>;

ContextPinning

type ContextPinning = { head: number; tail: number };

| Field | Type | Description |
|-------|------|-------------|
| head | number | Number of leading messages to always keep (e.g., system prompt) |
| tail | number | Number of trailing messages to always keep (e.g., instruction + prefill) |

Messages between head and tail are the "middle" — these are trimmed oldest-first when the assembled context exceeds the model's token budget. If head + tail >= messages.length, no trimming is performed.

GenXHooks

interface GenXHooks {
  onStateChange?(state: GenerationState): void;
  onTaskStarted?(taskId: string): void;
  beforeGenerate?(taskId: string, messages: Message[]): void;
}

State Machine

         ┌─────────────────────────────┐
         │                             │
         ▼                             │
       idle ──► queued ──► generating ─┤──► completed
                              │        │
                              ▼        │
                      waiting_for_user │
                              │        │
                              ▼        │
                    waiting_for_budget │
                              │        │
                              ▼        │
                          generating ──┘──► failed
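The diagram can be read as a transition table. This is an illustrative sketch derived from the diagram, not the library's internal state machine; in particular, the completed → idle and failed → idle edges (returning to idle when the queue drains) are an assumption:

```typescript
// Allowed status transitions, read off the state diagram above.
type Status =
  | "idle"
  | "queued"
  | "generating"
  | "waiting_for_budget"
  | "waiting_for_user"
  | "completed"
  | "failed";

const transitions: Record<Status, Status[]> = {
  idle: ["queued"],
  queued: ["generating"],
  generating: ["waiting_for_user", "completed", "failed"],
  waiting_for_user: ["waiting_for_budget"],
  waiting_for_budget: ["generating"],
  completed: ["idle"], // assumption: back to idle once the queue drains
  failed: ["idle"],    // assumption: same
};

function canTransition(from: Status, to: Status): boolean {
  return transitions[from].includes(to);
}
```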

License

MIT