

@aid-on/unillm


unillm is a unified LLM interface for edge computing. It provides a consistent, type-safe API across multiple LLM providers with minimal dependencies and optimized memory usage for edge environments.


Features

  • 🚀 Edge-First: ~50KB bundle size, ~10ms cold start, optimized for edge runtimes
  • 🔄 Unified Interface: Single API for Anthropic, OpenAI, Groq, Gemini, Cloudflare, and more
  • 🌊 Streaming Native: Built on the Web Streams API with nagare integration
  • 🎯 Type-Safe: Full TypeScript support with Zod schema validation
  • 📦 Minimal Dependencies: Only Zod (~11KB) required
  • 💾 Memory Optimized: Automatic chunking and backpressure handling

Installation

npm install @aid-on/unillm
# or
yarn add @aid-on/unillm
# or
pnpm add @aid-on/unillm

Quick Start

import { unillm } from "@aid-on/unillm";

// Fluent API with type safety
const response = await unillm()
  .model("openai:gpt-4o-mini")
  .credentials({ openaiApiKey: process.env.OPENAI_API_KEY })
  .temperature(0.7)
  .generate("Explain quantum computing in simple terms");

console.log(response.text);

Streaming with nagare

unillm's stream() returns an @aid-on/nagare Stream<T> for reactive stream processing:

import { unillm } from "@aid-on/unillm";
import type { Stream } from "@aid-on/nagare";

const stream: Stream<string> = await unillm()
  .model("groq:llama-3.3-70b-versatile")
  .credentials({ groqApiKey: "..." })
  .stream("Write a story about AI");

// Use nagare's reactive operators
const enhanced = stream
  .map(chunk => chunk.trim())
  .filter(chunk => chunk.length > 0)
  .throttle(16)  // ~60fps for UI updates
  .tap(chunk => console.log(chunk))
  .toSSE();      // Convert to Server-Sent Events
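
Because Stream<T> is async-iterable, it can also be consumed directly without nagare's operators (the React example below uses the same pattern):

// Collect streamed chunks with for await; this mirrors the consumption
// pattern used in the React example further down.
let story = "";
for await (const chunk of stream) {
  story += chunk;
}
console.log(story);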

Structured Output

Generate type-safe structured data with Zod schemas:

import { z } from "zod";
import { unillm } from "@aid-on/unillm";

const PersonSchema = z.object({
  name: z.string(),
  age: z.number(),
  skills: z.array(z.string())
});

const result = await unillm()
  .model("groq:llama-3.1-8b-instant")
  .credentials({ groqApiKey: "..." })
  .schema(PersonSchema)
  .generate("Generate a software engineer profile");

// Type-safe access
console.log(result.object.name);     // string
console.log(result.object.skills);   // string[]
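
The typing of result.object follows the schema. A sketch making that relationship explicit with z.infer, assuming result.object is typed as the schema's inferred type, as the comments above indicate:

// The inferred type of PersonSchema matches the shape of result.object.
type Person = z.infer<typeof PersonSchema>;
// => { name: string; age: number; skills: string[] }

const person: Person = result.object;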

Provider Shortcuts

Ultra-concise syntax for common models:

import { anthropic, openai, groq, gemini, cloudflare } from "@aid-on/unillm";

// One-liners for quick prototyping
await anthropic.sonnet("sk-ant-...").generate("Hello");
await openai.mini("sk-...").generate("Hello");
await groq.instant("gsk_...").generate("Hello");
await gemini.flash("AIza...").generate("Hello");
await cloudflare.llama({ accountId: "...", apiToken: "..." }).generate("Hello");
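
Each shortcut is sugar over the full builder. A hypothetical expansion of the Groq one, assuming it pins the instant model from the list below (the exact mapping is not documented here):

// Hypothetical equivalent of groq.instant("gsk_..."); the pinned model
// is an assumption, not confirmed by this README.
await unillm()
  .model("groq:llama-3.1-8b-instant")
  .credentials({ groqApiKey: "gsk_..." })
  .generate("Hello");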

Supported Models (45 models)

Anthropic (8 models, as of v0.4.0)

  • anthropic:claude-opus-4-5-20251101 - Claude Opus 4.5 (Most Intelligent)
  • anthropic:claude-haiku-4-5-20251001 - Claude Haiku 4.5 (Ultra Fast)
  • anthropic:claude-sonnet-4-5-20250929 - Claude Sonnet 4.5 (Best for Coding)
  • anthropic:claude-opus-4-1-20250805 - Claude Opus 4.1
  • anthropic:claude-opus-4-20250514 - Claude Opus 4
  • anthropic:claude-sonnet-4-20250514 - Claude Sonnet 4
  • anthropic:claude-3-5-haiku-20241022 - Claude 3.5 Haiku
  • anthropic:claude-3-haiku-20240307 - Claude 3 Haiku

OpenAI (9 models)

  • openai:gpt-4o - GPT-4o (Latest, fastest GPT-4)
  • openai:gpt-4o-mini - GPT-4o Mini (Cost-effective)
  • openai:gpt-4o-2024-11-20 - GPT-4o November snapshot
  • openai:gpt-4o-2024-08-06 - GPT-4o August snapshot
  • openai:gpt-4-turbo - GPT-4 Turbo (High capability)
  • openai:gpt-4-turbo-preview - GPT-4 Turbo Preview
  • openai:gpt-4 - GPT-4 (Original)
  • openai:gpt-3.5-turbo - GPT-3.5 Turbo (Fast & cheap)
  • openai:gpt-3.5-turbo-0125 - GPT-3.5 Turbo Latest

Groq (7 models)

  • groq:llama-3.3-70b-versatile - Llama 3.3 70B Versatile
  • groq:llama-3.1-8b-instant - Llama 3.1 8B Instant
  • groq:meta-llama/llama-guard-4-12b - Llama Guard 4 12B
  • groq:openai/gpt-oss-120b - GPT-OSS 120B
  • groq:openai/gpt-oss-20b - GPT-OSS 20B
  • groq:groq/compound - Groq Compound
  • groq:groq/compound-mini - Groq Compound Mini

Google Gemini (8 models)

  • gemini:gemini-3-pro-preview - Gemini 3 Pro Preview
  • gemini:gemini-3-flash-preview - Gemini 3 Flash Preview
  • gemini:gemini-2.5-pro - Gemini 2.5 Pro
  • gemini:gemini-2.5-flash - Gemini 2.5 Flash
  • gemini:gemini-2.0-flash - Gemini 2.0 Flash
  • gemini:gemini-2.0-flash-lite - Gemini 2.0 Flash Lite
  • gemini:gemini-1.5-pro-002 - Gemini 1.5 Pro 002
  • gemini:gemini-1.5-flash-002 - Gemini 1.5 Flash 002

Cloudflare Workers AI (13 models)

  • cloudflare:@cf/meta/llama-4-scout-17b-16e-instruct - Llama 4 Scout
  • cloudflare:@cf/meta/llama-3.3-70b-instruct-fp8-fast - Llama 3.3 70B FP8
  • cloudflare:@cf/meta/llama-3.1-70b-instruct - Llama 3.1 70B
  • cloudflare:@cf/meta/llama-3.1-8b-instruct-fast - Llama 3.1 8B Fast
  • cloudflare:@cf/meta/llama-3.1-8b-instruct - Llama 3.1 8B
  • cloudflare:@cf/openai/gpt-oss-120b - GPT-OSS 120B
  • cloudflare:@cf/openai/gpt-oss-20b - GPT-OSS 20B
  • cloudflare:@cf/ibm/granite-4.0-h-micro - IBM Granite 4.0
  • cloudflare:@cf/mistralai/mistral-small-3.1-24b-instruct - Mistral Small 3.1
  • cloudflare:@cf/mistralai/mistral-7b-instruct-v0.2 - Mistral 7B
  • cloudflare:@cf/google/gemma-3-12b-it - Gemma 3 12B
  • cloudflare:@cf/qwen/qwq-32b - QwQ 32B
  • cloudflare:@cf/qwen/qwen2.5-coder-32b-instruct - Qwen 2.5 Coder
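
Model IDs follow a provider:model convention, where the model part may itself contain slashes and @ signs (see the Cloudflare IDs above). A hypothetical helper, not exported by the package, that splits an ID at the first colon:

// Hypothetical helper (not part of @aid-on/unillm): split a model ID of
// the form "provider:model" at the first colon only; model names such as
// "@cf/openai/gpt-oss-120b" contain no further colons.
function parseModelId(id: string): { provider: string; model: string } {
  const i = id.indexOf(":");
  if (i === -1) throw new Error(`Invalid model ID: ${id}`);
  return { provider: id.slice(0, i), model: id.slice(i + 1) };
}

parseModelId("cloudflare:@cf/openai/gpt-oss-120b");
// => { provider: "cloudflare", model: "@cf/openai/gpt-oss-120b" }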

Advanced Usage

Fluent Builder Pattern

const builder = unillm()
  .model("groq:llama-3.3-70b-versatile")
  .credentials({ groqApiKey: "..." })
  .temperature(0.7)
  .maxTokens(1000)
  .topP(0.9)
  .system("You are a helpful assistant")
  .messages([
    { role: "user", content: "Previous question..." },
    { role: "assistant", content: "Previous answer..." }
  ]);

// Reusable configuration
const response1 = await builder.generate("New question");
const response2 = await builder.stream("Another question");

Memory Optimization

Automatic memory management for edge environments:

import { createMemoryOptimizedStream } from "@aid-on/unillm";

const stream = await createMemoryOptimizedStream(
  largeResponse,
  { 
    maxMemory: 1024 * 1024,  // 1MB limit
    chunkSize: 512           // Optimal chunk size
  }
);
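
Consuming the result incrementally keeps peak memory near the configured budget instead of the full response size. A sketch, assuming the optimized stream is async-iterable like the provider streams above:

// Process chunks as they arrive so each can be garbage-collected before
// the next one is read.
let total = 0;
for await (const chunk of stream) {
  total += chunk.length;
}
console.log(`done, total length ${total}`);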

Error Handling

import { UnillmError, RateLimitError } from "@aid-on/unillm";

try {
  const response = await unillm()
    .model("groq:llama-3.3-70b-versatile")
    .credentials({ groqApiKey: "..." })
    .generate("Hello");
} catch (error) {
  if (error instanceof RateLimitError) {
    console.log(`Rate limited. Retry after ${error.retryAfter}ms`);
  } else if (error instanceof UnillmError) {
    console.log(`LLM error: ${error.message}`);
  }
}
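
retryAfter makes a simple backoff loop straightforward. A minimal sketch, assuming retryAfter is a millisecond delay as the log message above implies:

import { unillm, RateLimitError } from "@aid-on/unillm";

// Retry generate() on rate limits, waiting retryAfter between attempts.
async function generateWithRetry(prompt: string, attempts = 3): Promise<string> {
  for (let attempt = 1; ; attempt++) {
    try {
      const response = await unillm()
        .model("groq:llama-3.3-70b-versatile")
        .credentials({ groqApiKey: "..." })
        .generate(prompt);
      return response.text;
    } catch (error) {
      if (!(error instanceof RateLimitError) || attempt >= attempts) throw error;
      await new Promise(resolve => setTimeout(resolve, error.retryAfter));
    }
  }
}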

Integration Examples

With React

import { useState } from "react";
import { unillm } from "@aid-on/unillm";

export default function ChatComponent() {
  const [response, setResponse] = useState("");
  const [loading, setLoading] = useState(false);
  
  const handleGenerate = async () => {
    setLoading(true);
    const stream = await unillm()
      .model("groq:llama-3.1-8b-instant")
      // Note: VITE_-prefixed variables are embedded in the client bundle;
      // proxy requests through a server to keep real keys private
      .credentials({ groqApiKey: import.meta.env.VITE_GROQ_API_KEY })
      .stream("Write a haiku");
    
    for await (const chunk of stream) {
      setResponse(prev => prev + chunk);
    }
    setLoading(false);
  };
  
  return (
    <div>
      <button onClick={handleGenerate} disabled={loading}>
        {loading ? "Generating..." : "Generate"}
      </button>
      <p>{response}</p>
    </div>
  );
}

With Cloudflare Workers

import { unillm } from "@aid-on/unillm";

// Env bindings used below (wrangler vars / secrets)
interface Env {
  CF_ACCOUNT_ID: string;
  CF_API_TOKEN: string;
}

export default {
  async fetch(request: Request, env: Env) {
    const stream = await unillm()
      .model("cloudflare:@cf/meta/llama-3.1-8b-instruct")
      .credentials({
        accountId: env.CF_ACCOUNT_ID,
        apiToken: env.CF_API_TOKEN
      })
      .stream("Hello from the edge!");
    
    return new Response(stream.toReadableStream(), {
      headers: { "Content-Type": "text/event-stream" }
    });
  }
};

API Reference

unillm() Builder Methods

| Method | Description | Example |
|--------|-------------|---------|
| model(id) | Set the model ID | model("groq:llama-3.3-70b-versatile") |
| credentials(creds) | Set API credentials | credentials({ groqApiKey: "..." }) |
| temperature(n) | Set temperature (0-1) | temperature(0.7) |
| maxTokens(n) | Set max tokens | maxTokens(1000) |
| topP(n) | Set top-p sampling | topP(0.9) |
| schema(zod) | Set output schema | schema(PersonSchema) |
| system(text) | Set system prompt | system("You are...") |
| messages(msgs) | Set message history | messages([...]) |
| generate(prompt) | Generate response | await generate("Hello") |
| stream(prompt) | Stream response | await stream("Hello") |

License

MIT