ai-media-kit

v0.2.2

AI Media Generation SDK — unified interface for Kling, Sora, Veo, Gemini and more

Unified TypeScript SDK for AI media generation APIs — video, image, and audio.

Highlights

  • Multi-provider — Kling, Seedance, Seedance 2, xAI (Grok), Suno, Gemini, and growing
  • Native request types — Full TypeScript interfaces per endpoint with JSDoc; no dumbed-down common denominator
  • Unified response — Consistent task.status, task.outputs, and task.raw across all providers
  • Async & sync — Async task model with polling for video/audio; direct results for sync APIs (Gemini)
  • SSE streaming — Built-in task.toSSEResponse() for browser streaming
  • Zero runtime dependencies

Install

npm install ai-media-kit
# or
bun add ai-media-kit
# or
pnpm add ai-media-kit

Providers

| Provider | Factory | Endpoints | Media | Execution model |
| :--- | :--- | :--- | :--- | :--- |
| Seedance 2 | createSeedance2() | generate | Video | Async (Task) |
| Seedance | createSeedance() | text2video, image2video, draft2video, generate | Video | Async (Task) |
| Kling | createKling() | text2video, image2video, multiImage2video | Video | Async (Task) |
| xAI (Grok) | createXai() | text2video, image2video, referenceVideo, generate, edit, extend | Video | Async (Task) |
| Suno | createSuno() | inspiration, customLyrics, continue, cover, persona, sound, generate + upload/concat/createPersona | Audio | Async (Task) |
| Gemini | createGemini() | generateContent | Image | Sync (direct) |


Seedance 2

Seedance 2.0 (doubao-seedance-2-0) — latest Volcengine video generation model. Supports text-to-video, image-to-video (first frame, last frame, reference image), reference video, reference audio, and audio generation.

import { createSeedance2 } from "ai-media-kit/providers/seedance2";

const sd2 = createSeedance2({
  baseUrl: "https://your-gateway.com/sd2",
  apiKey: process.env.SD2_API_KEY,
});

SD2: Text to Video

const task = await sd2.generate.create({
  model: "doubao-seedance-2-0-260128",
  content: [{ type: "text", text: "a cat yawning lazily in the sun" }],
  generate_audio: true,
  ratio: "16:9",
  resolution: "720p",
  duration: 5,
});

await task.wait();
console.log(task.outputs); // [{ type: "video", url: "https://..." }]

SD2: Image to Video (First Frame)

const task = await sd2.generate.create({
  model: "doubao-seedance-2-0-260128",
  content: [
    { type: "image_url", image_url: { url: "https://example.com/cat.jpg" }, role: "first_frame" },
    { type: "text", text: "the cat starts walking towards the camera" },
  ],
  generate_audio: true,
  ratio: "16:9",
  resolution: "720p",
  duration: 5,
});

SD2: Image to Video (First + Last Frame)

const task = await sd2.generate.create({
  model: "doubao-seedance-2-0-260128",
  content: [
    { type: "image_url", image_url: { url: "https://example.com/start.jpg" }, role: "first_frame" },
    { type: "image_url", image_url: { url: "https://example.com/end.jpg" }, role: "last_frame" },
    { type: "text", text: "smooth transition between two scenes" },
  ],
  generate_audio: true,
  ratio: "16:9",
  resolution: "720p",
  duration: 5,
});

SD2: Reference Image

Use 1–4 reference images to influence the visual style without locking the first frame.

const task = await sd2.generate.create({
  model: "doubao-seedance-2-0-260128",
  content: [
    { type: "image_url", image_url: { url: "https://example.com/ref1.jpg" }, role: "reference_image" },
    { type: "image_url", image_url: { url: "https://example.com/ref2.jpg" }, role: "reference_image" },
    { type: "text", text: "a stylish product showcase in the same visual style" },
  ],
  generate_audio: true,
  ratio: "16:9",
  resolution: "720p",
  duration: 5,
});
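
The 1–4 image limit can be checked before a request is submitted. The following is a minimal sketch, assuming the content item shapes shown in the examples above; `buildReferenceContent` is a hypothetical helper and is not part of ai-media-kit.

```typescript
// Hypothetical helper; the item shapes mirror the request examples above.
type ContentItem =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string }; role: "reference_image" };

function buildReferenceContent(imageUrls: string[], prompt: string): ContentItem[] {
  // The README states that 1 to 4 reference images are supported.
  if (imageUrls.length < 1 || imageUrls.length > 4) {
    throw new RangeError(`expected 1-4 reference images, got ${imageUrls.length}`);
  }
  const images = imageUrls.map((url): ContentItem => ({
    type: "image_url",
    image_url: { url },
    role: "reference_image",
  }));
  // Images first, prompt text last, matching the order used in the examples.
  return [...images, { type: "text", text: prompt }];
}
```

The returned array could then be passed as the `content` field of `sd2.generate.create()`.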

SD2: Reference Video

Provide a reference video to guide motion and style.

const task = await sd2.generate.create({
  model: "doubao-seedance-2-0-260128",
  content: [
    { type: "video_url", video_url: { url: "https://example.com/reference.mp4" }, role: "reference_video" },
    { type: "text", text: "recreate this motion with a different character" },
  ],
  generate_audio: true,
  ratio: "16:9",
  resolution: "720p",
  duration: 5,
});

SD2: Reference Audio

Provide a reference audio to sync the generated video.

const task = await sd2.generate.create({
  model: "doubao-seedance-2-0-260128",
  content: [
    { type: "audio_url", audio_url: { url: "https://example.com/narration.mp3" }, role: "reference_audio" },
    { type: "text", text: "a person speaking to the camera in a studio" },
  ],
  generate_audio: true,
  ratio: "16:9",
  resolution: "720p",
  duration: 10,
});

SD2: Combining Multiple Content Types

const task = await sd2.generate.create({
  model: "doubao-seedance-2-0-260128",
  content: [
    { type: "image_url", image_url: { url: "https://example.com/scene.jpg" }, role: "first_frame" },
    { type: "audio_url", audio_url: { url: "https://example.com/bgm.mp3" }, role: "reference_audio" },
    { type: "text", text: "the scene slowly zooms in as music plays" },
  ],
  generate_audio: true,
  ratio: "16:9",
  resolution: "720p",
  duration: 10,
});

SD2: SSE Streaming

// Next.js / Hono / any Web-standard handler
const task = await sd2.generate.create({ ... });
return task.toSSEResponse();

SD2: Resume from Task ID

const task = sd2.generate.fromId("task_xxxx");
await task.refresh(); // single status check

SD2: Parameters

| Parameter | Type | Default | Description |
| :--- | :--- | :--- | :--- |
| model | string | - | "doubao-seedance-2-0-260128" or "doubao-seedance-1-5-pro-251215" |
| content | array | - | Multimodal array: text, image_url, video_url, audio_url |
| generate_audio | boolean | true | Generate synchronized audio |
| ratio | string | "16:9" | "16:9" "9:16" "1:1" "4:3" "3:4" "21:9" "adaptive" |
| resolution | string | "720p" | "480p" or "720p" |
| duration | number | 5 | Video duration in seconds (5–15) |
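
The constraints in the table can be enforced on the client before a request is sent. This is a sketch only: the allowed values and defaults are copied from the table, while `validateSd2Options` and the `Sd2Options` type are hypothetical names that do not come from the SDK.

```typescript
// Allowed values copied from the parameter table; the validator is hypothetical.
const RATIOS = ["16:9", "9:16", "1:1", "4:3", "3:4", "21:9", "adaptive"];
const RESOLUTIONS = ["480p", "720p"];

interface Sd2Options {
  ratio?: string;
  resolution?: string;
  duration?: number;
}

function validateSd2Options(opts: Sd2Options): string[] {
  const problems: string[] = [];
  // Defaults follow the table: "16:9", "720p", 5 seconds.
  const ratio = opts.ratio ?? "16:9";
  const resolution = opts.resolution ?? "720p";
  const duration = opts.duration ?? 5;

  if (!RATIOS.some((r) => r === ratio)) problems.push(`unsupported ratio: ${ratio}`);
  if (!RESOLUTIONS.some((r) => r === resolution)) problems.push(`unsupported resolution: ${resolution}`);
  if (duration < 5 || duration > 15) problems.push(`duration out of range (5-15): ${duration}`);
  return problems;
}
```

An empty array means the options fit the documented ranges; the upstream API remains the final authority on what it accepts.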


Seedance (Official Volcengine API)

For the official Volcengine Seedance API. Supports all Seedance 1.x models and draft mode.

import { createSeedance } from "ai-media-kit/providers/seedance";

const seedance = createSeedance({
  baseUrl: "https://ark.cn-beijing.volces.com/api/v3",
  apiKey: process.env.SEEDANCE_API_KEY,
});

Seedance: Text to Video

const task = await seedance.text2video.create({
  model: "doubao-seedance-1-5-pro-251215",
  content: [{ type: "text", text: "a cat yawning lazily in the sun" }],
});
await task.wait();
console.log(task.outputs); // [{ type: "video", url: "https://..." }]

Seedance: Image to Video (First Frame)

const task = await seedance.image2video.create({
  model: "doubao-seedance-1-0-pro-250428",
  content: [
    { type: "image_url", image_url: { url: "https://example.com/cat.jpg" } },
    { type: "text", text: "the cat starts walking" },
  ],
});

Seedance: Image to Video (First + Last Frame)

const task = await seedance.image2video.create({
  model: "doubao-seedance-1-0-pro-250428",
  content: [
    { type: "image_url", image_url: { url: "https://example.com/start.jpg" }, role: "first_frame" },
    { type: "image_url", image_url: { url: "https://example.com/end.jpg" }, role: "last_frame" },
    { type: "text", text: "smooth camera dolly from start to end" },
  ],
});

Seedance: Reference Image (1–4 images)

const task = await seedance.image2video.create({
  model: "doubao-seedance-1-0-lite-i2v-250428",
  content: [
    { type: "image_url", image_url: { url: "https://example.com/ref.jpg" }, role: "reference_image" },
    { type: "text", text: "character walks through a park" },
  ],
});

Seedance: Draft Mode (1.5 Pro)

Generate a fast, low-cost draft preview (480p), then promote to full quality.

// Step 1: Draft
const draft = await seedance.text2video.create({
  model: "doubao-seedance-1-5-pro-251215",
  content: [{ type: "text", text: "epic drone shot over a mountain" }],
  draft: true,
});
await draft.wait();

// Step 2: Promote draft to full quality
const final = await seedance.draft2video.create({
  model: "doubao-seedance-1-5-pro-251215",
  content: [{ type: "draft_task", draft_task: { id: draft.taskId } }],
});
await final.wait();

Seedance: Advanced Options

const task = await seedance.text2video.create({
  model: "doubao-seedance-1-5-pro-251215",
  content: [{ type: "text", text: "a cinematic sunset over the ocean" }],
  ratio: "21:9",
  resolution: "1080p",
  duration: 10,
  generate_audio: true,
  return_last_frame: true, // get last frame for chaining videos
  seed: 42,
  service_tier: "flex",    // 50% cost, higher TPD
});

Seedance: Models

| Model | ID | Capabilities |
| :--- | :--- | :--- |
| 2.0 | doubao-seedance-2-0-260128 | t2v, i2v, reference video/audio, audio gen |
| 2.0 Fast | doubao-seedance-2-0-fast-260128 | Same as 2.0, faster |
| 1.5 Pro | doubao-seedance-1-5-pro-251215 | t2v, i2v, draft mode, audio gen |
| 1.0 Pro | doubao-seedance-1-0-pro-250428 | t2v, i2v (first+last frame) |
| 1.0 Pro Fast | doubao-seedance-1-0-pro-fast-250428 | t2v, i2v (first frame only) |
| 1.0 Lite T2V | doubao-seedance-1-0-lite-t2v-250428 | Text-to-video only |
| 1.0 Lite I2V | doubao-seedance-1-0-lite-i2v-250428 | i2v (first, first+last, reference 1–4) |


Kling

import { createKling } from "ai-media-kit/providers/kling";

const kling = createKling({
  baseUrl: "https://api.klingai.com",
  apiKey: process.env.KLING_API_KEY,
});

Kling: Text to Video

const task = await kling.text2video.create({
  model_name: "kling-v2-6",
  prompt: "a golden retriever running on a beach at sunset",
  mode: "pro",
  aspect_ratio: "16:9",
  duration: "10",
  sound: "on",
});
await task.wait();

Kling: Image to Video

const task = await kling.image2video.create({
  model_name: "kling-v2-6",
  image: "https://example.com/dog.jpg",
  prompt: "the dog starts running towards the camera",
  mode: "pro",
  duration: "5",
  sound: "on",
});

Kling: Image to Video with Camera Control

const task = await kling.image2video.create({
  model_name: "kling-v2-6",
  image: "https://example.com/landscape.jpg",
  prompt: "sweeping landscape view",
  camera_control: {
    type: "simple",
    config: { horizontal: 5, zoom: -3 },
  },
});

Kling: Multi-Image to Video

const task = await kling.multiImage2video.create({
  model_name: "kling-v1-6",
  image_list: [
    { image: "https://example.com/frame1.jpg" },
    { image: "https://example.com/frame2.jpg" },
    { image: "https://example.com/frame3.jpg" },
  ],
  prompt: "smooth transition between the three scenes",
  mode: "pro",
  duration: "10",
});

Kling: Models

| Model | Text-to-Video | Image-to-Video | Sound |
| :--- | :--- | :--- | :--- |
| kling-v3 | Yes | Yes | - |
| kling-v2-6 | Yes | Yes | Yes |
| kling-v2-5-turbo | Yes | Yes | - |
| kling-v2-1-master | Yes | Yes | - |
| kling-v2-master | Yes | Yes | - |
| kling-v1-6 | Yes | Yes + multi-image | - |
| kling-v1 | Yes | Yes | - |
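
The model table can double as a pre-flight capability check. The sketch below transcribes the table into a lookup; `checkKlingRequest` is a hypothetical helper, and reading a dash in the Sound column as "not supported" is an assumption.

```typescript
// Capability data transcribed from the model table; the lookup is hypothetical.
interface KlingCapabilities {
  multiImage: boolean;
  sound: boolean; // assumption: a dash in the table means "not supported"
}

const KLING_MODELS: Record<string, KlingCapabilities> = {
  "kling-v3": { multiImage: false, sound: false },
  "kling-v2-6": { multiImage: false, sound: true },
  "kling-v2-5-turbo": { multiImage: false, sound: false },
  "kling-v2-1-master": { multiImage: false, sound: false },
  "kling-v2-master": { multiImage: false, sound: false },
  "kling-v1-6": { multiImage: true, sound: false },
  "kling-v1": { multiImage: false, sound: false },
};

// Returns null when the request looks fine, otherwise a description of the problem.
function checkKlingRequest(model: string, wants: Partial<KlingCapabilities>): string | null {
  const caps = KLING_MODELS[model];
  if (!caps) return `unknown model: ${model}`;
  if (wants.sound && !caps.sound) return `${model} is not listed with sound support`;
  if (wants.multiImage && !caps.multiImage) return `${model} is not listed with multi-image support`;
  return null;
}
```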


xAI (Grok Imagine Video)

import { createXai } from "ai-media-kit/providers/xai";

const xai = createXai({
  baseUrl: "https://api.x.ai",
  apiKey: process.env.XAI_API_KEY,
});

xAI: Text to Video

const task = await xai.text2video.create({
  model: "grok-imagine-video",
  prompt: "a cat dancing on the moon",
  duration: 10,
  aspect_ratio: "16:9",
  resolution: "720p",
});
await task.wait();

xAI: Image to Video

const task = await xai.image2video.create({
  model: "grok-imagine-video",
  prompt: "the cat starts walking towards the camera",
  image: { url: "https://example.com/cat.jpg" },
});

xAI: Reference Images

Use <IMAGE_1>, <IMAGE_2> in the prompt to refer to specific images. Ideal for virtual try-on, product placement, or character-consistent storytelling.

const task = await xai.referenceVideo.create({
  model: "grok-imagine-video",
  prompt: "<IMAGE_1> is walking in a park wearing the outfit from <IMAGE_2>",
  reference_images: [
    { url: "https://example.com/person.jpg" },
    { url: "https://example.com/outfit.jpg" },
  ],
});
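
A prompt that mentions `<IMAGE_3>` while only two images are supplied is easy to miss. The sketch below shows one way to catch that before submitting; `findBadImagePlaceholders` is a hypothetical helper, and it assumes placeholders are numbered from 1 in the order of `reference_images`, as in the example above.

```typescript
// Hypothetical pre-flight check for <IMAGE_n> placeholders in a prompt.
function findBadImagePlaceholders(prompt: string, imageCount: number): number[] {
  const bad: number[] = [];
  const pattern = /<IMAGE_(\d+)>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(prompt)) !== null) {
    const index = Number(match[1]);
    // Assumption: placeholders are 1-based and map to reference_images in order.
    if (index < 1 || index > imageCount) bad.push(index);
  }
  return bad;
}
```

An empty result means every placeholder in the prompt has a matching reference image.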

xAI: Edit Video

High-fidelity edits with strong scene preservation.

const task = await xai.edit.create({
  model: "grok-imagine-video",
  prompt: "give the woman a silver necklace",
  video: { url: "https://example.com/video.mp4" },
});

xAI: Extend Video

Seamlessly extend an existing video. Output is one continuous video.

const task = await xai.extend.create({
  model: "grok-imagine-video",
  prompt: "the camera pans to reveal a vast mountain landscape",
  video: { url: "https://example.com/video.mp4" },
  duration: 5, // extension length (2–10s), added to original
});
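
Because the extension is added to the original clip, the final length is the sum of the two. A small sketch of that arithmetic, using the 2–10s range from the comment above; `extendedDuration` is a hypothetical helper and the original clip length is something the caller has to supply.

```typescript
// Hypothetical helper: total length after an extend call.
function extendedDuration(originalSeconds: number, extensionSeconds: number): number {
  // Range taken from the comment in the example above (2-10 seconds).
  if (extensionSeconds < 2 || extensionSeconds > 10) {
    throw new RangeError(`extension must be 2-10 seconds, got ${extensionSeconds}`);
  }
  return originalSeconds + extensionSeconds;
}
```

For example, extending an 8 second clip by 5 seconds yields a 13 second video.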

Suno (Music Generation)

import { createSuno } from "ai-media-kit/providers/suno";

const suno = createSuno({
  baseUrl: "https://your-gateway.com",
  apiKey: process.env.SUNO_API_KEY,
  pollInterval: 10000,
});

Suno: Inspiration Mode

Describe what you want — AI generates lyrics, melody, everything.

const task = await suno.inspiration.create({
  mvVersion: "chirp-v5",
  inputType: "10",
  gptDescriptionPrompt: "a warm pop song about driving home at night",
});
await task.wait();
// task.outputs → AudioOutput[] with extras (cover art, music video)

Suno: Custom Lyrics Mode

const task = await suno.customLyrics.create({
  mvVersion: "chirp-v5",
  inputType: "20",
  prompt: "[Verse 1]\nCity lights flicker on\nThe highway hums along\n\n[Chorus]\nDriving home tonight...",
  tags: "pop,female voice,warm,acoustic",
  title: "Night Breeze",
});

Suno: Continue a Track

Extend an existing clip from a specific timestamp.

const task = await suno.continue.create({
  mvVersion: "chirp-v5",
  inputType: "20",
  prompt: "[Verse 2]\nMoonlight on the road...",
  continueClipId: "clip_xxx",
  continueAt: "27", // seconds
});

Suno: Cover Mode

const task = await suno.cover.create({
  mvVersion: "chirp-v5",
  inputType: "20",
  coverClipId: "clip_xxx",
  tags: "jazz,male voice",
});

Suno: Persona Mode

Generate music using a trained voice/style persona.

// Step 1: Create a persona from a reference clip
const persona = await suno.createPersona({
  root_clip_id: "clip_xxx",
  name: "My Voice",
});

// Step 2: Generate with persona
const task = await suno.persona.create({
  mvVersion: "chirp-v5",
  inputType: "20",
  prompt: "[Verse]\nNew lyrics here...",
  tags: "pop,upbeat",
  title: "New Song",
  task: "artist_consistency",
  metadataParams: {
    artist_clip_id: persona.artist_clip_id,
    persona_id: persona.persona_id,
  },
});

Suno: Sound Effects

const task = await suno.sound.create({
  mvVersion: "chirp-v5",
  inputType: "30",
  gptDescriptionPrompt: "ocean waves crashing on rocks with seagulls",
});

Suno: Upload & Concat

// Upload audio by URL to get a clipId for continue/cover
const clip = await suno.uploadByUrl("https://example.com/song.mp3");
console.log(clip.clipId);

// Upload a local file (`file` is supplied by the caller)
const localClip = await suno.upload(file);

// Assemble clips into a full-length song
await suno.concat(clip.clipId);

Gemini (Image Generation)

Gemini image generation is synchronous: create() blocks until the image is ready (typically 5–30s), then returns the result directly. No Task, no polling.

import { createGemini } from "ai-media-kit/providers/gemini";

const gemini = createGemini({
  baseUrl: "https://generativelanguage.googleapis.com",
  apiKey: process.env.GEMINI_API_KEY,
});

Gemini: Text to Image

const result = await gemini.generateContent.create({
  model: "gemini-3.1-flash-image-preview",
  contents: [{ parts: [{ text: "A cat astronaut on the moon" }] }],
});

if (result.status === "completed") {
  console.log(result.outputs); // [{ type: "image", url: "data:image/png;base64,..." }]
} else {
  console.error(result.error); // { code, message }
}
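
Since the output URL in this example is a base64 data URL rather than a hosted link, callers usually need to split it before saving or uploading the image. The sketch below is one way to do that with plain string handling; `parseDataUrl` is a hypothetical helper, not an SDK function.

```typescript
// Hypothetical helper: split a base64 data URL into its MIME type and payload.
interface ParsedDataUrl {
  mimeType: string;
  base64: string;
}

function parseDataUrl(url: string): ParsedDataUrl | null {
  const match = /^data:([^;,]+);base64,([A-Za-z0-9+/=]*)$/.exec(url);
  if (!match) return null; // not a base64 data URL (for example, a plain https link)
  return { mimeType: match[1] ?? "", base64: match[2] ?? "" };
}
```

The `base64` field can then be decoded with whatever the runtime provides, such as `Buffer.from(base64, "base64")` in Node.js.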

Gemini: Image Editing

const result = await gemini.generateContent.create({
  model: "gemini-3.1-flash-image-preview",
  contents: [{
    parts: [
      { text: "Add a wizard hat to this cat" },
      { inlineData: { mimeType: "image/png", data: base64ImageData } },
    ],
  }],
  generationConfig: {
    responseModalities: ["TEXT", "IMAGE"],
    imageConfig: { aspectRatio: "1:1", imageSize: "2K" },
  },
});

Gemini: Google Search Grounding

const result = await gemini.generateContent.create({
  model: "gemini-3.1-flash-image-preview",
  contents: [{ parts: [{ text: "Current weather in Tokyo as an infographic" }] }],
  generationConfig: { responseModalities: ["TEXT", "IMAGE"] },
  tools: [{ googleSearch: {} }],
});

Gemini: Models

| Model | ID | Resolution | Notes |
| :--- | :--- | :--- | :--- |
| Nano Banana 2 | gemini-3.1-flash-image-preview | 512/1K/2K/4K | Extended ratios, image search, configurable thinking |
| Nano Banana Pro | gemini-3-pro-image-preview | 1K/2K/4K | Highest quality, thinking always on |
| Nano Banana | gemini-2.5-flash-image | 1K only | Fastest, cheapest |


Task Lifecycle (Async Providers)

For async providers (Kling, Seedance, Seedance 2, xAI, Suno), create() submits a task and returns a Task object:

create() ──→ submitted ──→ queued ──→ processing ──→ completed
                                                  └─→ failed / cancelled / expired
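
The diagram implies that four states end a task and three do not. A minimal sketch of that distinction, with the status names taken from the diagram; `isTerminal` and `pollsUntilTerminal` are hypothetical helpers used only to illustrate when a poller would stop.

```typescript
// Status names come from the lifecycle diagram; the helpers are hypothetical.
type TaskStatus =
  | "submitted" | "queued" | "processing"
  | "completed" | "failed" | "cancelled" | "expired";

const TERMINAL_STATUSES: readonly TaskStatus[] = ["completed", "failed", "cancelled", "expired"];

function isTerminal(status: TaskStatus): boolean {
  return TERMINAL_STATUSES.some((s) => s === status);
}

// Count how many status checks a poller would make before it stops.
function pollsUntilTerminal(observed: TaskStatus[]): number {
  let polls = 0;
  for (const status of observed) {
    polls += 1;
    if (isTerminal(status)) break;
  }
  return polls;
}
```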

Usage Patterns

const task = await provider.endpoint.create({ ... });

// 1. Block until done
await task.wait();

// 2. Single status check (no polling)
await task.refresh();

// 3. Event listener (no polling — call wait() or toSSEResponse() to start polling)
task.on("complete", (t) => console.log(t.outputs));
task.on("progress", (t) => console.log(t.progress));
task.on("error", (err) => console.error(err.message));

// 4. SSE Response for browsers (starts polling)
return task.toSSEResponse();

Unified Result Shape

task.taskId    // string — provider-assigned task ID
task.status    // "submitted" | "queued" | "processing" | "completed" | "failed" | "cancelled" | "expired"
task.progress  // number | null (null when provider doesn't report progress)
task.outputs   // MediaOutput[] | null
task.error     // { code, message } | null
task.raw       // Raw upstream response, as-is
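
Because `outputs` can be null and may mix media types, a small accessor keeps call sites tidy. This is a sketch under assumptions: the `MediaOutput` shape is reduced to the `type` and `url` fields seen in the examples, the `"audio"` type string is assumed by analogy with `"video"` and `"image"`, and `urlsOfType` is a hypothetical helper.

```typescript
// Reduced stand-in for the SDK's MediaOutput; only fields shown in the examples.
interface MediaOutput {
  type: "video" | "image" | "audio"; // "audio" is assumed by analogy
  url: string;
}

function urlsOfType(outputs: MediaOutput[] | null, type: MediaOutput["type"]): string[] {
  // Treat null (no outputs yet, or a failed task) as an empty list.
  return (outputs ?? []).filter((o) => o.type === type).map((o) => o.url);
}
```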

Resuming Tasks

const task = provider.endpoint.fromId("existing-task-id");
await task.refresh();  // one-shot check
await task.wait();     // or resume polling

SSE Event Format

// SSE events sent by toSSEResponse():
// event: progress → { taskId, status, progress }
// event: complete → { taskId, status, outputs }
// event: error    → { taskId, status, message, code, raw }
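
On the wire, each of these events is a standard server-sent event frame: an `event:` line, a `data:` line, and a blank line. The sketch below formats and parses one frame; it assumes the payload is JSON-encoded in a single `data:` field, which the README does not state explicitly, and both function names are hypothetical.

```typescript
// Build one SSE frame. Assumption: the payload is sent as JSON in the data field.
function formatSseEvent(event: string, payload: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Parse one SSE frame back into its event name and decoded payload.
function parseSseEvent(frame: string): { event: string; data: unknown } | null {
  let event = "message"; // default event name when no event: line is present
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) {
      event = line.slice(6).replace(/^ /, "");
    } else if (line.startsWith("data:")) {
      dataLines.push(line.slice(5).replace(/^ /, ""));
    }
  }
  if (dataLines.length === 0) return null;
  return { event, data: JSON.parse(dataLines.join("\n")) };
}
```

In a browser, `EventSource` does this parsing for you; the parser here is mainly useful when reading the stream manually with `fetch`.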

Error Handling

import { APIError, TaskError } from "ai-media-kit";

try {
  const task = await provider.endpoint.create({ ... });
  await task.wait();
} catch (err) {
  if (err instanceof TaskError) {
    // Business failure: task failed, cancelled, or expired
    console.error(err.code, err.message, err.raw);
  } else if (err instanceof APIError) {
    // Protocol error: HTTP 401, 429, 500, malformed response
    console.error(err.status, err.message, err.raw);
  } else {
    // Not an SDK error: let it propagate
    throw err;
  }
}
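
Protocol errors differ in whether a retry can help: a 401 will fail again, while a 429 or 5xx may succeed later. The sketch below shows one possible retry policy keyed on the HTTP status; it is not something ai-media-kit is documented to do, and `retryDelayMs` with its backoff constants is hypothetical.

```typescript
// Hypothetical retry policy: returns a delay in ms, or null when retrying is pointless.
function retryDelayMs(
  status: number,
  attempt: number, // 0 for the first retry
  baseMs = 1000,
  maxMs = 30000,
): number | null {
  // Retry rate limits and server errors; auth and other client errors are final.
  const retryable = status === 429 || (status >= 500 && status <= 599);
  if (!retryable) return null;
  // Exponential backoff, capped at maxMs.
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

In the catch block above, the status passed in would come from `err.status` on an `APIError`.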

Provider Config

All providers accept a ProviderConfig:

{
  baseUrl: string;       // Official API or your proxy/gateway
  apiKey: string;
  auth?: AuthConfig;     // Default: Bearer token
  defaultHeaders?: Record<string, string | null>;
  pollInterval?: number; // Polling interval in ms (default: 2000)
  debug?: boolean;       // Log requests & polling (default: false)
}
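
To make the defaults concrete, the sketch below resolves a config into effective values. The `pollInterval` and `debug` defaults and the Bearer scheme come from the comments above; treating a `null` header value as "remove this header" is an assumption about why the type allows `null`, and `resolveConfig` is a hypothetical function, not the SDK's internals. The `auth` field is left out because its shape is not documented here.

```typescript
// Subset of ProviderConfig used by this sketch (auth omitted).
interface ProviderConfigSketch {
  baseUrl: string;
  apiKey: string;
  defaultHeaders?: Record<string, string | null>;
  pollInterval?: number;
  debug?: boolean;
}

interface ResolvedConfig {
  baseUrl: string;
  pollInterval: number;
  debug: boolean;
  headers: Record<string, string>;
}

function resolveConfig(config: ProviderConfigSketch): ResolvedConfig {
  // Default auth per the comment above: Bearer token.
  const headers: Record<string, string> = { Authorization: `Bearer ${config.apiKey}` };
  const overrides = config.defaultHeaders ?? {};
  for (const name of Object.keys(overrides)) {
    const value = overrides[name];
    if (value === null || value === undefined) {
      delete headers[name]; // assumption: null removes a header
    } else {
      headers[name] = value;
    }
  }
  return {
    baseUrl: config.baseUrl,
    pollInterval: config.pollInterval ?? 2000,
    debug: config.debug ?? false,
    headers,
  };
}
```

Header names are matched case-sensitively in this sketch, so `authorization: null` would not remove the `Authorization` header.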

Imports

// Import from root
import { createKling, createSeedance, createSeedance2, createXai, createSuno, createGemini } from "ai-media-kit";

// Or import individual providers (tree-shakable)
import { createSeedance2 } from "ai-media-kit/providers/seedance2";
import { createKling } from "ai-media-kit/providers/kling";

// Import types
import type { Task, MediaOutput, VideoOutput, ImageOutput, AudioOutput, TaskStatus } from "ai-media-kit";

Development

bun install              # Install dependencies
bun run build            # TypeScript compile
bun run dev              # Dev server (watch mode)
bun run lint             # Biome check
bun run format           # Biome fix
bun test                 # Unit tests (mocked)
bun run test:integration # Integration tests (needs .env)
bun run clean            # Clean build artifacts

Copy .env.example to .env and fill in your API keys before running integration tests.

License

MIT