@workkit/ai v0.1.1
Typed helpers for Cloudflare Workers AI — type-safe model calls, streaming, fallback chains
@workkit/ai
Typed Workers AI client with streaming, fallback chains, and retry
Install
bun add @workkit/ai

Usage
Before (raw Workers AI)
// Untyped, no error handling, no retry, no fallback
const result = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
messages: [{ role: "user", content: "Hello" }],
}) // any — what shape is this?
// Streaming requires manual ReadableStream handling
// No fallback if a model is down
// No retry on transient failures

After (@workkit/ai)
import { ai, streamAI, fallback, withRetry } from "@workkit/ai"
const client = ai(env.AI)
// Typed inference
const result = await client.run("@cf/meta/llama-3.1-8b-instruct", {
messages: [{ role: "user", content: "Hello" }],
})
// result.data — typed output
// result.model — model that was used
// Streaming
const stream = streamAI(env.AI, "@cf/meta/llama-3.1-8b-instruct", {
messages: [{ role: "user", content: "Tell me a story" }],
})
return new Response(stream, { headers: { "Content-Type": "text/event-stream" } })
// Fallback chain — try models in order
const response = await fallback(env.AI, {
models: [
"@cf/meta/llama-3.1-70b-instruct",
"@cf/meta/llama-3.1-8b-instruct",
],
input: { messages: [{ role: "user", content: "Hello" }] },
})
// Automatic retry with backoff
const retried = await withRetry(
() => client.run("@cf/meta/llama-3.1-8b-instruct", { messages }),
{ maxRetries: 3, backoff: "exponential" },
)

API
Client
ai(binding) — Create a typed AI client from env.AI
.run(model, inputs, opts?) — Run inference, returns AiResult<T>
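To make the AiResult<T> shape concrete, here is a minimal sketch of what a typed wrapper over the binding could look like. The data and model fields come from the README above; the AiBinding interface and the internal cast are assumptions, not the package's actual source.

```typescript
// Hypothetical sketch of a typed client wrapper (not the real implementation).
// `AiBinding` approximates the Workers AI binding surface used here.
interface AiBinding {
  run(model: string, inputs: unknown): Promise<unknown>;
}

// Result shape per the README: `result.data` (typed output), `result.model`.
interface AiResult<T> {
  data: T;
  model: string;
}

function ai(binding: AiBinding) {
  return {
    async run<T = unknown>(model: string, inputs: unknown): Promise<AiResult<T>> {
      // The cast stands in for whatever runtime validation the package does.
      const data = (await binding.run(model, inputs)) as T;
      return { data, model };
    },
  };
}
```

Recording which model actually answered (rather than which was requested) is what makes a result shape like this composable with fallback chains.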
Streaming
streamAI(binding, model, inputs, opts?) — Returns a ReadableStream for SSE
Fallback
fallback(binding, options) — Try models in order until one succeeds
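The fallback behavior can be sketched as a loop over the model list, catching per-model failures and rethrowing only if every model fails. The models/input option names mirror the README; the error handling details are assumptions.

```typescript
// Sketch of a fallback chain: first successful model wins.
interface FallbackOptions {
  models: string[];
  input: unknown;
}

async function fallbackSketch(
  binding: { run(model: string, inputs: unknown): Promise<unknown> },
  options: FallbackOptions,
): Promise<{ data: unknown; model: string }> {
  let lastError: unknown;
  for (const model of options.models) {
    try {
      const data = await binding.run(model, options.input);
      return { data, model }; // record which model actually answered
    } catch (err) {
      lastError = err; // model down or errored: fall through to the next one
    }
  }
  throw lastError ?? new Error("fallback: no models provided");
}
```

Listing the larger model first, as in the usage example above, means you pay for quality when it is available and degrade gracefully when it is not.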
Retry
withRetry(fn, options) — Retry with configurable backoff (exponential, linear, fixed)
calculateDelay(attempt, options) — Calculate the retry delay
defaultIsRetryable(error) — Default retry eligibility check
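The three backoff strategies imply delay math along these lines. The base delay and cap values here are illustrative assumptions, not the package's defaults:

```typescript
// Sketch of per-attempt delay for each backoff strategy (attempt is 0-based).
type Backoff = "exponential" | "linear" | "fixed";

function delayFor(attempt: number, backoff: Backoff, baseMs = 100, maxMs = 10_000): number {
  switch (backoff) {
    case "exponential":
      return Math.min(baseMs * 2 ** attempt, maxMs); // 100, 200, 400, ...
    case "linear":
      return Math.min(baseMs * (attempt + 1), maxMs); // 100, 200, 300, ...
    case "fixed":
      return baseMs; // constant delay between attempts
  }
}
```

Real implementations often add random jitter on top of these curves so that many clients retrying at once do not synchronize.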
Utilities
estimateTokens(text) — Rough token count estimation
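A rough estimator like this is commonly a characters-per-token heuristic (about 4 characters per token for English text). Whether the package uses exactly this ratio is an assumption; the sketch only illustrates the idea:

```typescript
// Rough token count: ~4 characters per token is a common English-text heuristic.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```

An estimate like this is useful for checking prompts against a model's context window before paying for a real inference call.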
License
MIT
