@tamasha/litellm-wrapper
v1.0.0
Production-grade unified multimodal LLM client for the Tamasha LiteLLM proxy. Supports GPT and Gemini families through a single API surface with adapter, transformer, and strategy patterns.
@tamasha-packages/litellm-wrapper
Production-grade unified multimodal LLM client for the Tamasha LiteLLM proxy. One API surface, multiple model families (gpt, gemini), full TypeScript types, ESM + CJS dual-published.
Features
- No-modelFamily API: pass just the model id; the package looks up the family in its built-in catalog.
- Adapter / Transformer / Strategy patterns keep family-specific concerns out of the public API.
- Realtime WebSocket client for streaming/voice: a single RealtimeSession API on top of GPT Realtime and Gemini Live.
- Strict TypeScript (strict: true, noUncheckedIndexedAccess), Zod-validated config, structured pino logging, p-retry with exponential backoff.
- Pluggable: register custom families via registerFamily(...) and custom models via registerModel(...).
- Tested: Vitest unit + integration + e2e suites with v8 coverage thresholds.
- Node 20+ ESM-first, CJS supplied via dual build.
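The feature list mentions p-retry with exponential backoff. As a rough illustration of what such a schedule looks like, the sketch below computes backoff delays and wraps an async call in a retry loop. It does not use p-retry itself, and the helper names (backoffDelayMs, withRetry) as well as the 1 s base, factor 2, and 30 s cap are assumptions for illustration, not the package's actual settings.

```typescript
// Hypothetical helper: delay before retry number `attempt` (1-based).
function backoffDelayMs(attempt: number, baseMs = 1_000, factor = 2, capMs = 30_000): number {
  return Math.min(capMs, baseMs * Math.pow(factor, attempt - 1));
}

// Hypothetical retry loop: runs `fn`, retrying up to `maxRetries` times on failure.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait only if another attempt will follow.
      if (attempt < maxRetries) await sleep(backoffDelayMs(attempt + 1));
    }
  }
  throw lastError;
}

// Delays before retries 1, 2 and 3 under the assumed defaults.
const backoffSchedule = [1, 2, 3].map((n) => backoffDelayMs(n));
console.log(backoffSchedule);
```

Passing a custom sleep function keeps the loop easy to test without real timers.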
Install
npm install @tamasha-packages/litellm-wrapper
# Optional: enables GPT video frame extraction
npm install ffmpeg-static
Quick start
import createClient, { TextTask } from '@tamasha-packages/litellm-wrapper';
const client = createClient({
model: 'gpt-4o', // family inferred from model id
litellmUrl: process.env.LITELLM_URL!, // http(s), no trailing slash, no query/fragment
apiKey: process.env.LITELLM_KEY!, // ≥8 chars, no whitespace
});
const summary = await client.analyzeText(
'Tamasha is a hyper-local audio social network.',
TextTask.Summarize,
);
const vision = await client.analyzeImage(
'https://signed.example.com/image.png',
'What is in this image?',
);
const generated = await client.generateImage('A cinematic Mumbai skyline at dusk', {
size: '1024x1024',
});
chat()
import createClient from '@tamasha-packages/litellm-wrapper';
const client = createClient({
model: 'gpt-4o',
litellmUrl: process.env.LITELLM_URL!,
apiKey: process.env.LITELLM_KEY!,
});
const result = await client.chat({
systemPrompt: 'You are a translator.',
prompt: 'Translate "hello" to French.',
outputFormat: {
type: 'json_schema',
name: 'translation',
strict: true,
schema: {
type: 'object',
properties: { fr: { type: 'string' } },
required: ['fr'],
additionalProperties: false,
},
},
});
const { fr } = JSON.parse(result.content);
input accepts tagged content ({ kind: 'image', value, mimeType? }). Setting outputModalities: ['image', 'text'] requests image output (e.g. with gemini-2.5-flash-image).
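In the chat() example, result.content is a JSON string, so JSON.parse returns an untyped value. A small type guard can confirm the shape before it is used. The sketch below is standalone: Translation, isTranslation, and parseTranslation are hypothetical names, and the sample string stands in for result.content.

```typescript
// Hypothetical shape matching the json_schema from the chat() example.
interface Translation {
  fr: string;
}

// Type guard: true only when `value` is an object with a string `fr` field.
function isTranslation(value: unknown): value is Translation {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as Record<string, unknown>)['fr'] === 'string'
  );
}

// Parses model output and throws a descriptive error on unexpected content.
function parseTranslation(content: string): Translation {
  const parsed: unknown = JSON.parse(content);
  if (!isTranslation(parsed)) {
    throw new Error('Model output does not match the translation schema');
  }
  return parsed;
}

// Stand-in for `result.content` returned by the chat() call.
const sampleContent = '{"fr":"bonjour"}';
console.log(parseTranslation(sampleContent).fr);
```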
createEphemeralSession()
import createClient from '@tamasha-packages/litellm-wrapper';
const client = createClient({
model: 'gpt-4o-realtime-preview',
litellmUrl: process.env.LITELLM_URL!,
apiKey: process.env.LITELLM_KEY!,
});
const { id, ephemeralKey } = await client.createEphemeralSession();
// Send `ephemeralKey` to your client app; the client opens its own WebSocket.
This is distinct from createRealtimeClient(), which opens a server-side WebSocket. Use createEphemeralSession() when the client app handles the WebSocket directly.
Built-in models
The wrapper uses actual model identifiers (not LiteLLM model_name aliases) so that family is obvious from inspection. Configure your LiteLLM proxy with model_name entries that match these identifiers exactly.
| Family | Models |
| --- | --- |
| GPT (gpt) | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo, gpt-5-nano, gpt-4o-realtime-preview, gpt-4o-mini-realtime-preview, whisper-1, dall-e-3, dall-e-2, gpt-image-1 |
| Gemini (gemini) | gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite, gemini-2.0-flash, gemini-2.0-flash-lite, gemini-1.5-pro, gemini-1.5-flash, gemini-2.0-flash-live-001, gemini-3.1-flash-live-preview, gemini-2.5-flash-image, gemini-2.5-flash-image-preview |
Get the live list at runtime:
import { listSupportedModels, listModelsByFamily, ModelFamily } from '@tamasha-packages/litellm-wrapper';
listSupportedModels(); // → ['dall-e-2', 'dall-e-3', ...]
listModelsByFamily(ModelFamily.GEMINI); // → ['gemini-1.5-flash', ...]
Adding a new model
Two ways.
1. Edit the built-in list
src/core/model-catalog.ts is the source of truth. Append the new identifier to the right family array:
export const BUILTIN_GPT_MODELS = [
'gpt-4o',
'gpt-4o-mini',
// ...
'gpt-5', // ← add here
] as const;
The catalog and family resolution pick it up automatically.
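To make the automatic pickup concrete, here is a minimal sketch of how as const arrays can feed a model-to-family lookup. It is not the package's actual implementation: the arrays are abbreviated, and names such as SKETCH_CATALOG and familyOf are hypothetical.

```typescript
// Abbreviated stand-ins for the built-in lists in src/core/model-catalog.ts.
const SKETCH_GPT_MODELS = ['gpt-4o', 'gpt-4o-mini'] as const;
const SKETCH_GEMINI_MODELS = ['gemini-2.5-pro', 'gemini-2.5-flash'] as const;

type SketchFamily = 'gpt' | 'gemini';

// One lookup table built from every family array, so appending an id
// to an array is enough for the lookup to see it.
const SKETCH_CATALOG = new Map<string, SketchFamily>();
SKETCH_GPT_MODELS.forEach((id) => SKETCH_CATALOG.set(id, 'gpt'));
SKETCH_GEMINI_MODELS.forEach((id) => SKETCH_CATALOG.set(id, 'gemini'));

// Hypothetical resolver: returns the family or throws for unknown ids.
function familyOf(model: string): SketchFamily {
  const family = SKETCH_CATALOG.get(model);
  if (family === undefined) {
    throw new Error(`Unsupported model: ${model}`);
  }
  return family;
}

console.log(familyOf('gpt-4o-mini')); // prints "gpt"
console.log(familyOf('gemini-2.5-flash')); // prints "gemini"
```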
2. Register at runtime (no fork needed)
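import createClient, { registerModel, ModelFamily } from '@tamasha-packages/litellm-wrapper';
// Register before constructing the client, because the model id is validated at construction time.
registerModel('o3-mini', ModelFamily.GPT);
registerModel('gemini-3-pro', ModelFamily.GEMINI);
const client = createClient({ model: 'o3-mini', litellmUrl: process.env.LITELLM_URL!, apiKey: process.env.LITELLM_KEY! });
For a brand-new family altogether, register a provider too: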
import { registerFamily, registerModel } from '@tamasha-packages/litellm-wrapper';
// ClaudeFamilyProvider is your own provider implementation (not shown here).
registerFamily('claude', (deps) => new ClaudeFamilyProvider(deps));
registerModel('claude-opus-4-7', 'claude');
Realtime / voice
import { createRealtimeClient, RealtimeEventType } from '@tamasha-packages/litellm-wrapper';
const realtime = createRealtimeClient({
model: 'gpt-4o-realtime-preview', // family inferred from model id
litellmUrl: process.env.LITELLM_URL!,
apiKey: process.env.LITELLM_KEY!,
voiceMode: true,
});
const session = await realtime.connect({ reconnect: true });
session.on(RealtimeEventType.TextDelta, (chunk) => process.stdout.write(String(chunk)));
// playPcm is your own audio playback function (not shown here).
session.on(RealtimeEventType.AudioBuffer, (frame) => playPcm(frame));
session.send({ type: RealtimeEventType.TextDelta, data: 'Tell me a 30s bedtime story.' });
Public API
createClient(config, options?) (default export)
Returns a TamashaLitellmClient. Equivalent to new TamashaLitellmClient(config, options).
TamashaLitellmClient
| Method | Returns |
| --- | --- |
| analyzeVideo(input, prompt, options?) | Promise<UnifiedResponse> |
| analyzeAudio(input, prompt, options?) | Promise<UnifiedResponse> |
| analyzeText(text, task, options?) | Promise<UnifiedResponse> |
| analyzeImage(input, prompt, options?) | Promise<UnifiedResponse> |
| generateImage(prompt, options?) | Promise<UnifiedResponse> |
| getConfig() | Readonly<ResolvedClientConfig> (includes derived modelFamily) |
| getAdapter() | active IFamilyAdapter |
TamashaRealtimeClient
| Method | Returns |
| --- | --- |
| connect({ reconnect? }) | Promise<RealtimeSession> |
| getConfig() | Readonly<ResolvedRealtimeConfig> |
RealtimeSession
Methods: send(event), on(type, fn) (returns an unsubscribe function), off, and close.
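The pattern where on(type, fn) returns an unsubscribe function can be illustrated with a tiny standalone emitter. This is only a sketch of the pattern, not the RealtimeSession implementation; SketchEmitter, SketchListener, and the 'text.delta' event name are made up for the example.

```typescript
type SketchListener = (payload: unknown) => void;

// Minimal emitter whose `on` returns a function that removes the listener.
class SketchEmitter {
  private readonly listeners = new Map<string, Set<SketchListener>>();

  on(type: string, fn: SketchListener): () => void {
    const set = this.listeners.get(type) ?? new Set<SketchListener>();
    set.add(fn);
    this.listeners.set(type, set);
    return () => this.off(type, fn);
  }

  off(type: string, fn: SketchListener): void {
    this.listeners.get(type)?.delete(fn);
  }

  emit(type: string, payload: unknown): void {
    this.listeners.get(type)?.forEach((fn) => fn(payload));
  }
}

const sketchEmitter = new SketchEmitter();
const receivedChunks: unknown[] = [];
const unsubscribe = sketchEmitter.on('text.delta', (chunk) => {
  receivedChunks.push(chunk);
});

sketchEmitter.emit('text.delta', 'Once upon');
unsubscribe(); // listener removed; later events are ignored
sketchEmitter.emit('text.delta', ' a time');

console.log(receivedChunks); // holds only 'Once upon'
```

Returning the unsubscribe function from on saves callers from keeping a reference to the handler just to pass it to off later.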
Errors
All errors extend LitellmWrapperError and carry a stable code:
| Class | Code |
| --- | --- |
| ConfigurationError | CONFIGURATION_ERROR |
| UnsupportedFamilyError | UNSUPPORTED_FAMILY |
| UnsupportedModelError | UNSUPPORTED_MODEL |
| ValidationError | VALIDATION_ERROR |
| ProxyRequestError | PROXY_REQUEST_ERROR |
| TimeoutError | TIMEOUT_ERROR |
| RealtimeError | REALTIME_ERROR |
| TransformError | TRANSFORM_ERROR |
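Because every error carries a stable code, callers can branch on the code rather than on message text. The sketch below uses local stand-in classes so that it runs on its own; in real code the error classes would come from the package. The constructor signatures and the describeFailure helper are assumptions.

```typescript
// Local stand-ins mirroring two of the documented class/code pairs.
class LitellmWrapperError extends Error {
  constructor(message: string, readonly code: string) {
    super(message);
    // Keep `instanceof` working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}

class UnsupportedModelError extends LitellmWrapperError {
  constructor(model: string) {
    super(`Unsupported model: ${model}`, 'UNSUPPORTED_MODEL');
  }
}

class ProxyRequestError extends LitellmWrapperError {
  constructor(message: string) {
    super(message, 'PROXY_REQUEST_ERROR');
  }
}

// Hypothetical helper: maps an unknown failure to a short action hint.
function describeFailure(err: unknown): string {
  if (!(err instanceof LitellmWrapperError)) return 'unexpected error';
  switch (err.code) {
    case 'UNSUPPORTED_MODEL':
      return 'check the model id or register it';
    case 'PROXY_REQUEST_ERROR':
      return 'inspect the proxy response';
    default:
      return `unhandled code ${err.code}`;
  }
}

console.log(describeFailure(new UnsupportedModelError('gpt-9')));
```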
Configuration reference
interface ClientConfig {
model: string; // looked up in catalog → derives family
litellmUrl: string; // http(s), no trailing slash, no query/fragment
apiKey: string; // ≥8 printable-ASCII chars, no whitespace
maxRetries?: number; // 0..10, default 3
timeout?: number; // 1..600_000 ms, default 30_000
logger?: LoggerLike; // pino-compatible
metrics?: MetricsHook;
}
interface RealtimeConfig extends ClientConfig {
voiceMode?: boolean;
}
Validation rules
| Field | Rule |
| --- | --- |
| model | must be present in defaultModelCatalog. Throws UnsupportedModelError otherwise. |
| litellmUrl | absolute http or https, has a host, no trailing slash, no query string, no fragment. |
| apiKey | ≥8 printable-ASCII chars, no whitespace. |
| maxRetries | integer, 0–10. |
| timeout | integer, 1–600 000 ms. |
Validation runs at construction time so misconfiguration fails fast.
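As an illustration of the litellmUrl and apiKey rules in the table, the following sketch checks them with the standard URL class and a regular expression. The package is described as using Zod for this; the plain-TypeScript version here, including the function names and messages, is only an approximation of the documented rules.

```typescript
// Returns a parsed URL, or undefined for relative or malformed input.
function tryParseUrl(raw: string): URL | undefined {
  try {
    return new URL(raw);
  } catch {
    return undefined;
  }
}

// Returns the list of rule violations for a candidate litellmUrl (empty = valid).
function checkLitellmUrl(raw: string): string[] {
  const url = tryParseUrl(raw);
  if (url === undefined) return ['must be an absolute URL'];
  const problems: string[] = [];
  if (url.protocol !== 'http:' && url.protocol !== 'https:') problems.push('must be http or https');
  if (url.hostname === '') problems.push('must have a host');
  if (raw.endsWith('/')) problems.push('no trailing slash');
  if (raw.includes('?')) problems.push('no query string');
  if (raw.includes('#')) problems.push('no fragment');
  return problems;
}

// At least 8 printable, non-space ASCII characters (0x21 to 0x7E).
function isValidApiKey(key: string): boolean {
  return /^[\x21-\x7e]{8,}$/.test(key);
}

console.log(checkLitellmUrl('https://proxy.example.com')); // prints []
console.log(checkLitellmUrl('https://proxy.example.com/?x=1'));
console.log(isValidApiKey('sk-12345')); // prints true
```

Collecting all violations instead of stopping at the first one lets a caller report every problem in a single pass.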
Logging
Bring your own pino-compatible logger via config.logger. Without one, a pino logger is used; the level can be set with LITELLM_WRAPPER_LOG_LEVEL=debug|info|warn|error|silent.
Metrics hooks
config.metrics = {
onRequestStart: ({ type, family, model }) => stats.inc('llm.start', { type, family, model }),
onRequestEnd: ({ ok, durationMs /* , ... */ }) => stats.timing('llm.end', durationMs, { ok }),
};
Development
npm install
npm run lint
npm run typecheck
npm run test
npm run test:coverage
npm run build
License
MIT © Tamasha Team
