getaiapi
Typed AI provider SDKs. One import per provider.
Each AI provider gets a typed namespace with one function per model. No generic generate(), no model strings, no mapping layers. What you type is what gets sent.
Install
npm install getaiapi
Kling AI
69 generation models plus 45 management, list, and query functions. Each model is a typed function with Kling-native field names.
Setup
export KLING_ACCESS_KEY="your-access-key"
export KLING_SECRET_KEY="your-secret-key"
Or configure programmatically:
import { kling } from 'getaiapi'
kling.configure({ accessKey: '...', secretKey: '...' })
Text to Video
9 models: V1 Standard, V1.6 Pro/Standard, V2 Master, V2.1 Master, V2.5 Turbo Pro, V2.6 Pro, V3 Pro/Standard.
import { kling } from 'getaiapi'
const result = await kling.textToVideoV3Pro({
prompt: 'a golden retriever running on a beach at sunset',
duration: '5',
aspect_ratio: '16:9',
sound: 'on',
})
console.log(result.videos[0].url)
| Function | Model | Mode |
|----------|-------|------|
| textToVideoV1Standard | kling-v1 | std |
| textToVideoV1_6Pro | kling-v1-6 | pro |
| textToVideoV1_6Standard | kling-v1-6 | std |
| textToVideoV2Master | kling-v2-master | — |
| textToVideoV2_1Master | kling-v2-1-master | — |
| textToVideoV2_5TurboPro | kling-v2-5-turbo | pro |
| textToVideoV2_6Pro | kling-v2-6 | pro |
| textToVideoV3Pro | kling-v3 | pro |
| textToVideoV3Standard | kling-v3 | std |
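A sketch exercising the optional fields from TextToVideoInput below; the model choice and all values are illustrative:
import { kling } from 'getaiapi'
const result = await kling.textToVideoV2_5TurboPro({
  prompt: 'a paper boat drifting down a rainy street',
  negative_prompt: 'blur, low quality',
  duration: '10',
  aspect_ratio: '9:16',
  cfg_scale: 0.5,
  sound: 'off',
})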
Input: TextToVideoInput
{
prompt: string // required
negative_prompt?: string
duration?: string // '5' or '10'
aspect_ratio?: string // '16:9', '9:16', '1:1'
cfg_scale?: number
sound?: 'on' | 'off' // generate audio
}
Image to Video
13 models: V1 Standard, V1.5 Pro, V1.6 Pro/Standard, V2 Master, V2.1 Master/Pro/Standard, V2.5 Turbo Pro/Standard, V2.6 Pro, V3 Pro/Standard.
const result = await kling.imageToVideoV3Pro({
image: 'https://example.com/photo.jpg',
prompt: 'animate this photo with gentle wind',
duration: '5',
})
| Function | Model | Mode |
|----------|-------|------|
| imageToVideoV1Standard | kling-v1 | std |
| imageToVideoV1_5Pro | kling-v1-5 | pro |
| imageToVideoV1_6Pro | kling-v1-6 | pro |
| imageToVideoV1_6Standard | kling-v1-6 | std |
| imageToVideoV2Master | kling-v2-master | — |
| imageToVideoV2_1Master | kling-v2-1-master | — |
| imageToVideoV2_1Pro | kling-v2-1 | pro |
| imageToVideoV2_1Standard | kling-v2-1 | std |
| imageToVideoV2_5TurboPro | kling-v2-5-turbo | pro |
| imageToVideoV2_5TurboStandard | kling-v2-5-turbo | std |
| imageToVideoV2_6Pro | kling-v2-6 | pro |
| imageToVideoV3Pro | kling-v3 | pro |
| imageToVideoV3Standard | kling-v3 | std |
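image_tail pins the final frame, which helps with loops and transitions. A sketch with illustrative URLs:
const result = await kling.imageToVideoV2_1Pro({
  image: 'https://example.com/start-frame.jpg', // first frame
  image_tail: 'https://example.com/end-frame.jpg', // last frame
  prompt: 'smooth transition between the two frames',
  duration: '5',
})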
Input: ImageToVideoInput
{
image: string // required — URL or base64
prompt?: string
negative_prompt?: string
duration?: string
aspect_ratio?: string
cfg_scale?: number
sound?: 'on' | 'off'
image_tail?: string // end frame image URL
voice_list?: Array<{ voice_id: string }> // mutually exclusive with element_list
element_list?: Array<{ element_id: number }> // mutually exclusive with voice_list
}
Omni Video
17 models across O1 and O3 variants. Supports text-to-video, image-to-video, reference-to-video, video editing, and video reference — all through one endpoint.
const result = await kling.omniVideoO3ProTextToVideo({
prompt: 'a cyberpunk city at night',
duration: '5',
aspect_ratio: '16:9',
})
| Function | Model | Mode |
|----------|-------|------|
| omniVideoO1ImageToVideo | kling-video-o1 | — |
| omniVideoO1ReferenceToVideo | kling-video-o1 | — |
| omniVideoO1StandardImageToVideo | kling-video-o1 | std |
| omniVideoO1StandardReferenceToVideo | kling-video-o1 | std |
| omniVideoO1StandardVideoEdit | kling-video-o1 | std |
| omniVideoO1StandardVideoReference | kling-video-o1 | std |
| omniVideoO1VideoEdit | kling-video-o1 | — |
| omniVideoO1VideoReference | kling-video-o1 | — |
| omniVideoO3ProImageToVideo | kling-v3-omni | pro |
| omniVideoO3ProReferenceToVideo | kling-v3-omni | pro |
| omniVideoO3ProTextToVideo | kling-v3-omni | pro |
| omniVideoO3ProVideoEdit | kling-v3-omni | pro |
| omniVideoO3ProVideoReference | kling-v3-omni | pro |
| omniVideoO3StandardReferenceToVideo | kling-v3-omni | std |
| omniVideoO3StandardTextToVideo | kling-v3-omni | std |
| omniVideoO3StandardVideoEdit | kling-v3-omni | std |
| omniVideoO3StandardVideoReference | kling-v3-omni | std |
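A reference-to-video sketch that reuses a library element through element_list (see Element Library below); the element_id is illustrative:
const result = await kling.omniVideoO3ProReferenceToVideo({
  prompt: 'the character explores an abandoned space station',
  element_list: [{ element_id: 12345 }], // id of a previously created element
  duration: '5',
  aspect_ratio: '16:9',
})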
Input: OmniVideoInput
{
prompt: string // required
image?: string
negative_prompt?: string
duration?: string
aspect_ratio?: string
cfg_scale?: number
sound?: 'on' | 'off'
element_list?: Array<{ element_id: number }>
}
Image Generation
2 models on v1/images/generations and 3 models on v1/images/omni-image.
const result = await kling.imageO1({
prompt: 'a watercolor painting of a mountain lake',
n: 2,
aspect_ratio: '16:9',
})
console.log(result.images[0].url)
| Function | Endpoint | Model |
|----------|----------|-------|
| imageV3TextToImage | generations | kling-v3 |
| imageV3ImageToImage | generations | kling-v3 |
| imageO1 | omni-image | kling-image-o1 |
| imageO3TextToImage | omni-image | kling-v3-omni |
| imageO3ImageToImage | omni-image | kling-v3-omni |
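For the image-to-image variants, pass the source via the image field from the input type below; values here are illustrative:
const result = await kling.imageO3ImageToImage({
  image: 'https://example.com/photo.jpg', // source image
  prompt: 'restyle as a vintage travel poster',
  n: 2,
})
console.log(result.images[0].url)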
Input: ImageGenerationInput / OmniImageInput
{
prompt: string // required
image?: string // for image-to-image
n?: number // number of outputs
aspect_ratio?: string
}
Virtual Try-On
const result = await kling.virtualTryOn({
human_image: 'https://example.com/person.jpg',
cloth_image: 'https://example.com/shirt.jpg',
})
Input: VirtualTryOnInput
{
human_image: string // required
cloth_image: string // required
}
AI Avatar
4 models: V1 Pro/Standard, V2 Pro/Standard.
const result = await kling.avatarV2Pro({
image: 'https://example.com/portrait.jpg',
sound_file: 'https://example.com/speech.mp3',
prompt: 'talking head presentation',
})
| Function | Mode |
|----------|------|
| avatarV1Pro | pro |
| avatarV1Standard | std |
| avatarV2Pro | pro |
| avatarV2Standard | std |
Input: AvatarInput
{
image: string // required — portrait image
sound_file?: string // audio for lip sync
prompt?: string
}
Lip Sync
const result = await kling.lipSyncAudioToVideo({
sound_file: 'https://example.com/speech.mp3',
})
| Function | Description |
|----------|-------------|
| lipSyncAudioToVideo | Audio-driven lip sync |
| lipSyncTextToVideo | Text-driven lip sync |
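LipSyncInput below only documents the audio path. For the text-driven variant, the field names in this sketch (text, voice_id) are assumptions passed through the index signature (see Extra Parameters):
// text and voice_id are assumed Kling-native fields, not part of the documented LipSyncInput
const result = await kling.lipSyncTextToVideo({
  text: 'Welcome back to the channel.',
  voice_id: 'your-voice-id',
})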
Input: LipSyncInput
{
sound_file?: string // audio URL
}
Video Effects
4 models: V1 Standard, V1.5 Pro, V1.6 Pro/Standard.
const result = await kling.effectsV1_6Pro({
image: 'https://example.com/photo.jpg',
})
| Function |
|----------|
| effectsV1Standard |
| effectsV1_5Pro |
| effectsV1_6Pro |
| effectsV1_6Standard |
Input: EffectsInput
{
image: string // required
}
Motion Control
4 models: V2.6 Pro/Standard, V3 Pro/Standard.
const result = await kling.motionControlV3Pro({
image_url: 'https://example.com/scene.jpg',
prompt: 'camera pan left',
})
| Function | Model | Mode |
|----------|-------|------|
| motionControlV2_6Pro | kling-v2-6 | pro |
| motionControlV2_6Standard | kling-v2-6 | std |
| motionControlV3Pro | kling-v3 | pro |
| motionControlV3Standard | kling-v3 | std |
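video_url supplies a motion reference clip alongside the source image; all fields come from MotionControlInput below, with illustrative values:
const result = await kling.motionControlV3Pro({
  image_url: 'https://example.com/character.jpg', // character to animate
  video_url: 'https://example.com/dance-reference.mp4', // motion reference
  keep_original_sound: true,
})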
Input: MotionControlInput
{
image_url: string // required
video_url?: string
prompt?: string
keep_original_sound?: boolean
character_orientation?: string
element_list?: Array<{ element_id: number }>
}
Text to Speech (Sync)
Returns immediately — no polling.
const result = await kling.tts({ text: 'Hello world' })
console.log(result.audios[0].url)
Input: TtsInput
{
text: string // required
}
Video to Audio
Generates audio for a video. Returns both the merged video and the generated audio tracks.
const result = await kling.videoToAudio({
video_url: 'https://example.com/video.mp4',
sound_effect_prompt: 'ocean waves crashing',
})
console.log(result.videos[0].url) // merged video with audio
console.log(result.audios[0].url_mp3) // audio track (mp3)
console.log(result.audios[0].url_wav) // audio track (wav)
Input: VideoToAudioInput
{
video_url?: string // mutually exclusive with video_id
video_id?: string // mutually exclusive with video_url
sound_effect_prompt?: string
bgm_prompt?: string // background music prompt
asmr_mode?: boolean // enhanced detailed sound effects
}
Text to Audio
const result = await kling.textToAudio({
prompt: 'thunderstorm with heavy rain',
duration: 5.0,
})
console.log(result.audios[0].url) // normalized from url_mp3
console.log(result.audios[0].url_mp3) // mp3 URL
console.log(result.audios[0].url_wav) // wav URL
Input: TextToAudioInput
{
prompt: string // required
duration: number // required — 3.0 to 10.0
}
Voice Clone
const result = await kling.createVoice({
voice_name: 'my-voice',
voice_url: 'https://example.com/sample.mp3',
})
console.log(result.voices[0].voice_id)
console.log(result.voices[0].trial_url)
Input: CreateVoiceInput
{
voice_name: string // required
voice_url?: string // audio sample URL
video_id?: string // or extract from video
}
Multi-Shot
Generate multi-angle reference images from a frontal image. Each result image carries 3 angle variants (url_1, url_2, url_3).
const result = await kling.multiShot({
element_frontal_image: 'https://example.com/face.jpg',
})
console.log(result.images[0].url_1) // angle 1
console.log(result.images[0].url_2) // angle 2
console.log(result.images[0].url_3) // angle 3
Input: MultiShotInput
{
element_frontal_image: string // required
}
Reference to Image
const result = await kling.referenceToImage({
prompt: 'portrait in watercolor style',
n: 2,
})
Input: ReferenceToImageInput
{
prompt: string // required
n?: number
aspect_ratio?: string
}
Expand Image
Outpainting — expand an image beyond its borders.
const result = await kling.expandImage({
image: 'https://example.com/photo.jpg',
prompt: 'extend the landscape',
})
Input: ExpandImageInput
{
image: string // required
prompt?: string
n?: number
}
Extend Video
Continue a video beyond its last frame.
const result = await kling.extendVideo({
prompt: 'the camera continues to pan right',
})
Input: ExtendVideoInput
{
prompt?: string
negative_prompt?: string
}
Identify Face (Sync)
Detect faces in a video for lip-sync targeting. Returns immediately — no polling.
const result = await kling.identifyFace({
video_url: 'https://example.com/video.mp4',
})
console.log(result.session_id)
result.face_data.forEach(face => {
console.log(face.face_id, face.face_image, face.start_time, face.end_time)
})
Input: IdentifyFaceInput
{
video_url?: string // mutually exclusive with video_id
video_id?: string // mutually exclusive with video_url
}
Image Recognize (Sync)
Returns immediately — no polling.
const result = await kling.imageRecognize({
image: 'https://example.com/photo.jpg',
})
Input: ImageRecognizeInput
{
image: string // required
}
Account Costs
Query resource package balances under your account. Free to call; QPS ≤ 1. Note: remaining_quantity has a 12-hour reporting delay.
const result = await kling.accountCosts({
start_time: Date.now() - 86_400_000, // last 24h
end_time: Date.now(),
})
for (const pack of result.resource_pack_subscribe_infos) {
console.log(pack.resource_pack_name, pack.remaining_quantity, pack.status)
}
Input: AccountCostsInput
{
start_time: number // required — Unix ms
end_time: number // required — Unix ms
resource_pack_name?: string // optional — filter by exact package name
}
Output: AccountCostsResult
{
resource_pack_subscribe_infos: Array<{
resource_pack_name: string
resource_pack_id: string
resource_pack_type: 'decreasing_total' | 'constant_period'
total_quantity: number
remaining_quantity: number // 12h delay
purchase_time: number
effective_time: number
invalid_time: number
status: 'toBeOnline' | 'online' | 'expired' | 'runOut'
}>
}
Element Library
Create reusable characters/objects (elements) from images or video, then reference them in generation tasks via element_list.
// Create a custom element (async — polls until ready)
const el = await kling.createElement({
element_name: 'My Character',
element_description: 'A hero in a red cape',
reference_type: 'image_refer',
element_image_list: {
frontal_image: 'https://example.com/frontal.jpg',
refer_images: [{ image_url: 'https://example.com/side.jpg' }],
},
})
// Use element_id in video generation
await kling.imageToVideoV3Pro({
image: 'https://example.com/scene.jpg',
prompt: 'Character walks forward',
element_list: [{ element_id: Number(el.element_id) }],
})
// List all custom elements (paginated)
const { elements } = await kling.listElements({ pageNum: 1, pageSize: 30 })
// List official preset elements
const { elements: presets } = await kling.listPresetElements()
// Delete a custom element
await kling.deleteElement({ element_id: el.element_id })
createElement input: CreateElementInput
{
element_name: string // required — max 20 chars
element_description: string // required — max 100 chars
reference_type: 'image_refer' | 'video_refer' // required
element_image_list?: { // required when image_refer
frontal_image: string // front-facing image URL or base64
refer_images?: Array<{ image_url: string }> // 1–3 additional angles
}
element_video_list?: { // required when video_refer
refer_videos: Array<{ video_url: string }> // 1 video, .mp4/.mov, 3–8s
}
element_voice_id?: string
tag_list?: Array<{ tag_id: string }> // o_101–o_108
callback_url?: string
external_task_id?: string
timeout?: number // poll timeout ms
}
createElement output: ElementResult
{
element_id: string
element_name: string
element_description: string
reference_type: 'image_refer' | 'video_refer'
status: string
owned_by?: string
element_voice_id?: string
tag_list?: Array<{ tag_id: string; tag_name?: string }>
element_image_list?: { frontal_image: string; refer_images?: Array<{ image_url: string }> }
element_video_list?: { refer_videos: Array<{ video_url: string }> }
}
Voice Management
import { kling } from 'getaiapi'
// List custom voices (paginated)
const { voices } = await kling.listVoices({ pageNum: 1, pageSize: 30 })
// List preset voices from Kling's library
const { voices: presets } = await kling.listPresetVoices()
// Query a single voice creation task
const result = await kling.queryVoice('task-id')
// Delete a custom voice
await kling.deleteVoice('voice-id')
Character Speaking with Custom Voice
element_list and voice_list are mutually exclusive on all video endpoints — you cannot pass both at once. To get a character (element) to speak in their own custom voice, use a two-step approach: generate TTS audio first, then drive an avatar with the character's image.
Use case A — Character video with custom voice (avatar)
import { kling } from 'getaiapi'
// Step 1: Generate speech audio from your custom voice
const audio = await kling.tts({
text: 'Hello, welcome to my world.',
voice_id: 'your-custom-voice-id',
voice_language: 'en',
})
// Step 2: Animate the character image with that audio (lip-synced)
const video = await kling.avatarV2Pro({
image: el.element_image_list.frontal_image, // frontal image of the element created earlier
audio_id: audio.audios[0].id, // TTS result audio ID
prompt: 'looking at the camera, friendly expression',
})
console.log(video.videos[0].url)
Use case B — Character video with element (no custom voice)
When you only need visual character consistency and don't need a specific voice:
const video = await kling.imageToVideoV3Pro({
image: 'https://example.com/scene.jpg',
prompt: 'Character walks through a forest',
element_list: [{ element_id: Number(el.element_id) }],
sound: 'on', // Kling generates audio automatically
})
Use case C — Character video with voice (no element)
When you only need a specific voice track and don't need element-based character consistency:
const video = await kling.imageToVideoV3Pro({
image: 'https://example.com/character.jpg',
prompt: '<<<voice_1>>> Hello, welcome to my world.',
voice_list: [{ voice_id: 'your-custom-voice-id' }],
})
Multi-Elements Video Workflow
import { kling } from 'getaiapi'
// Step 1: Initialize video
const { session_id } = await kling.initMultiElementsSelection({ video_url: 'https://...' })
// Step 2: Click points to select an area
await kling.addSelectionArea({ session_id, frame_index: 10, points: [{ x: 0.5, y: 0.5 }] })
// Step 3: Preview selection
const preview = await kling.previewSelection({ session_id })
// Step 4: Generate edited video (polls until complete)
const video = await kling.generateMultiElementsVideo({
session_id,
edit_mode: 'swap',
image_list: [{ image: 'https://...' }],
prompt: 'swap <<<image_1>>> for element from <<<video_1>>>',
})
List & Query Historical Tasks
Every generation endpoint has a list function (paginated) and a single-task query function:
// List recent tasks (all generation types)
const { tasks } = await kling.listImageToVideoTasks({ pageNum: 1, pageSize: 20 })
const { tasks: videoTasks } = await kling.listTextToAudioTasks()
// Query a single task result by ID (returns same typed result as the generation function)
const video = await kling.getImageToVideoTask('task-id') // KlingVideoResult
const audio = await kling.getTextToAudioTask('task-id') // KlingAudioResult
const image = await kling.getImageGenerationTask('task-id') // KlingImageResult
Available list functions: listLipSyncTasks, listTextToAudioTasks, listVideoEffectsTasks, listImageGenerationTasks, listOmniVideoTasks, listMultiShotTasks, listImageToVideoTasks, listOmniImageTasks, listReferenceToImageTasks, listVirtualTryOnTasks, listMotionControlTasks, listExtendVideoTasks, listAvatarTasks.
Available query functions: getLipSyncTask, getTextToAudioTask, getVideoEffectsTask, getImageGenerationTask, getOmniVideoTask, getMultiShotTask, getImageToVideoTask, getOmniImageTask, getReferenceToImageTask, getVirtualTryOnTask, getMotionControlTask, getExtendVideoTask, getAvatarTask.
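The two families compose naturally, for example re-fetching recent results after a missed webhook. A sketch; the task_id field on listed entries is an assumption based on the result types below:
const { tasks } = await kling.listImageToVideoTasks({ pageNum: 1, pageSize: 50 })
for (const task of tasks) {
  const video = await kling.getImageToVideoTask(task.task_id) // task_id assumed present on listed entries
  console.log(video.videos[0]?.url)
}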
Output Types
All functions return typed results based on output modality:
// Video endpoints (textToVideo, imageToVideo, omniVideo, avatar, lipSync, effects, motionControl, extendVideo)
interface KlingVideoResult {
task_id: string
videos: Array<{ id: string; url: string; duration: string }>
}
// Image endpoints (imageGeneration, omniImage, virtualTryOn, referenceToImage, expandImage)
interface KlingImageResult {
task_id: string
images: Array<{ index: number; url: string }>
}
// Audio endpoints (tts, textToAudio)
interface KlingAudioResult {
task_id: string
audios: Array<{ id: string; url: string; url_mp3?: string; url_wav?: string; duration?: string; duration_mp3?: string; duration_wav?: string }>
}
// Multi-shot endpoint — 3 angle URLs per image
interface KlingMultiShotResult {
task_id: string
images: Array<{ index: number; url_1: string; url_2: string; url_3: string }>
}
// Voice clone endpoint
interface KlingVoiceResult {
task_id: string
voices: Array<{ voice_id: string; voice_name: string; trial_url: string; owned_by: string }>
}
// Video-to-audio endpoint — merged video + generated audio
interface KlingVideoAudioResult {
task_id: string
videos: Array<{ id: string; url: string; duration: string }>
audios: Array<{ id: string; url_mp3?: string; url_wav?: string; duration_mp3?: string; duration_wav?: string }>
}
// Face detection (identifyFace) — sync, no task_id
interface KlingFaceResult {
session_id: string
face_data: Array<{ face_id: string; face_image: string; start_time: number; end_time: number }>
}
// Generic JSON (imageRecognize)
interface KlingJsonResult {
task_id: string
data: unknown
}
Polling Control
All functions accept optional polling parameters:
await kling.textToVideoV3Pro({
prompt: 'a sunset',
timeout: 600_000, // max wait time in ms (default: 300_000 = 5 min)
pollInterval: 5_000, // poll frequency in ms (default: 3_000)
})
Sync endpoints (tts, imageRecognize, identifyFace) return immediately regardless of these settings.
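Because each async function polls internally and resolves with the final result, fanning out is plain Promise.all, with each call keeping its own timeout:
import { kling } from 'getaiapi'
const prompts = ['a sunrise over dunes', 'a foggy harbor', 'a neon-lit alley']
// All three tasks poll concurrently; the slowest one bounds the total wait
const results = await Promise.all(
  prompts.map(prompt => kling.textToVideoV3Standard({ prompt, timeout: 600_000 })),
)
results.forEach(r => console.log(r.videos[0].url))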
Extra Parameters
All input types accept additional Kling-native fields via index signature. Pass any parameter the Kling API supports:
await kling.textToVideoV3Pro({
prompt: 'a sunset',
camera_control: { type: 'simple', config: { horizontal: 5 } },
callback_url: 'https://example.com/webhook',
})
Error Handling
import { kling, KlingAuthError, KlingTimeoutError, KlingTaskFailedError } from 'getaiapi'
try {
await kling.textToVideoV3Pro({ prompt: 'test' })
} catch (err) {
if (err instanceof KlingAuthError) {
// Missing or invalid credentials
}
if (err instanceof KlingTimeoutError) {
// Task took too long (increase timeout)
}
if (err instanceof KlingTaskFailedError) {
// Kling rejected the task (content violation, bad params, etc.)
console.error(err.taskId, err.message)
}
}
| Error | Code | When |
|-------|------|------|
| KlingAuthError | AUTH_ERROR | Missing credentials or 401 response |
| KlingRateLimitError | RATE_LIMIT | HTTP 429 or body codes 1100-1102 |
| KlingApiError | API_ERROR | Provider returned an error |
| KlingTimeoutError | TIMEOUT | Polling exceeded timeout |
| KlingTaskFailedError | TASK_FAILED | Task status is 'failed' |
All errors extend KlingError which extends Error.
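That hierarchy allows a single instanceof check as a catch-all, assuming KlingError itself is exported alongside the subclasses:
import { kling, KlingError } from 'getaiapi' // KlingError export is an assumption
try {
  await kling.tts({ text: 'hello' })
} catch (err) {
  if (err instanceof KlingError) {
    console.error('Kling SDK error:', err.message)
  } else {
    throw err // not from this SDK, rethrow
  }
}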
Deprecated: v1 Unified Gateway
The previous generate(), submit(), poll() APIs and the multi-provider registry are deprecated but still exported for backward compatibility. They will be removed in the next major version.
// Deprecated — still works but will be removed
import { generate } from 'getaiapi'
await generate({ model: 'flux-schnell', prompt: '...' })
// New — use provider-specific typed functions
import { kling } from 'getaiapi'
await kling.textToVideoV3Pro({ prompt: '...' })
License
MIT
