@abdoseadaa/convai

v0.1.4

Published

21 hours ago

Typed shorthand SDK for OpenAI stateful conversations and stateless chat completions


abdoseadaa

openai conversations chat ai sdk gpt stateful context responses-api chat-completions typescript

convai

Typed shorthand SDK for the OpenAI Responses API and Chat Completions API.

convai wraps the official openai npm package with two things developers always end up writing themselves: persistent conversation context and clean shorthand functions that skip the boilerplate.

Every function exists in two forms — one that throws a typed SdkError, and one that returns { success, data, error } — so you pick the calling style that fits each use case.

Why convai?

The official openai package is complete but low-level. Using it for real products means writing the same wrappers repeatedly:

Chaining previous_response_id for multi-turn context
Parsing choices[0].message.content on every call
Building pagination loops for history
Handling rate_limit_exceeded vs insufficient_quota differently
Parsing tool call arguments from JSON strings
Accumulating streaming deltas into a full string

convai does all of that once, correctly, and exposes it as clean named functions with full TypeScript types.

Install

npm install @abdoseadaa/convai openai

openai >= 4.98.0 is a peer dependency — install it alongside convai.

Requirements: Node.js >= 18.0.0, TypeScript >= 5.0 (optional but recommended)

Quick Start

import { createClient } from '@abdoseadaa/convai'

const ai = createClient({
  apiKey: process.env.OPENAI_API_KEY!,
  model:  'gpt-4o',
})

// ── Stateful: context is managed by OpenAI ────────────────────────────────────
const { convId } = await ai.session.startSession('You are a senior MERN developer.')

const r1 = await ai.shorthand.ask(convId, 'What DB handles high write throughput best?')
const r2 = await ai.shorthand.ask(convId, 'How does that compare to PostgreSQL?')
// r2 has full context — no messages[] array to manage

// ── Stateless: one-off calls with no context ──────────────────────────────────
const label   = await ai.chat_shorthand.classify('I love this product!', ['positive', 'negative', 'neutral'])
const summary = await ai.chat_shorthand.summarize(longText, 'bullet')
const french  = await ai.chat_shorthand.translate('Hello world', 'French')

Two APIs, One Client

convai covers both OpenAI conversation patterns:

| | Stateful | Stateless |
|---|---|---|
| Context | OpenAI manages it server-side | You own messages[] |
| API used | Responses API (previous_response_id) | Chat Completions API |
| Namespace | ai.conversation, ai.chat, ai.session, ai.shorthand, etc. | ai.completion, ai.chat_shorthand |
| Best for | Chatbots, assistants, multi-turn agents | Classification, transforms, one-shot prompts |

Two Layers, Every Function

Every function in convai exists in two forms. Pick by context:

Layer 1 — throws `SdkError` (use in services and composed logic)

try {
  const { convId } = await ai.session.startSession('You are a helpful assistant.')
  const reply = await ai.shorthand.ask(convId, 'Explain closures in JavaScript.')
  console.log(reply)
} catch (err) {
  const e = err as SdkError
  console.log(e.formatted.code)        // 'RATE_LIMITED'
  console.log(e.formatted.message)     // 'Rate limit exceeded — too many requests...'
  console.log(e.formatted.hint)        // 'Back off and retry. Check retryAfterMs...'
  console.log(e.formatted.retryable)   // true
  console.log(e.formatted.retryAfterMs) // 12000
  console.log(e.formatted.requestId)   // 'req_abc123' — for OpenAI support tickets
  console.log(e.raw)                   // original OpenAI APIError, untouched
}

Layer 2 — returns `SdkResult<T>` (use in route controllers, never throws)

const result = await ai.safe.shorthand.ask(convId, 'Explain closures in JavaScript.')

if (result.success) {
  console.log(result.data)   // string — TypeScript narrows this, no cast needed
} else {
  console.log(result.error.code)       // ErrorCode enum value
  console.log(result.error.hint)       // actionable next step
  console.log(result.error.retryable)  // boolean
}

The SdkResult<T> type is a discriminated union: TypeScript narrows result.data to T when result.success is true, and result.error to FormattedError when it is false. After the success check, neither branch needs a further null check.

// TypeScript enforces the check — this won't compile:
result.data.someField    // ❌ Error: data may be null

// This works:
if (result.success) {
  result.data.someField  // ✅ TypeScript knows data is T here
}
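To make the two-layer idea concrete, here is a minimal sketch of a result type and a wrapper that converts a throwing function into one that returns a result. Only the `{ success, data, error }` shape comes from this README; `SketchError`, `Result`, `wrapSafeSketch`, and `parsePositive` are hypothetical names, and convai's real `wrapSafe` utility may differ.

```typescript
// Illustrative only: a simplified error shape, not convai's full FormattedError.
interface SketchError {
  code: string
  message: string
  retryable: boolean
}

// Discriminated union keyed on `success`.
type Result<T> =
  | { success: true; data: T; error: null }
  | { success: false; data: null; error: SketchError }

// Wrap a throwing async function so it resolves to a Result instead of rejecting.
function wrapSafeSketch<A extends unknown[], T>(
  fn: (...args: A) => Promise<T>,
): (...args: A) => Promise<Result<T>> {
  return async (...args: A): Promise<Result<T>> => {
    try {
      return { success: true, data: await fn(...args), error: null }
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err)
      return {
        success: false,
        data: null,
        error: { code: 'UNKNOWN', message, retryable: false },
      }
    }
  }
}

// Hypothetical throwing function used to exercise the wrapper.
async function parsePositive(input: string): Promise<number> {
  const n = Number(input)
  if (!Number.isFinite(n) || n <= 0) throw new Error(`not a positive number: ${input}`)
  return n
}

const safeParsePositive = wrapSafeSketch(parsePositive)
```

Calling `safeParsePositive('nope')` resolves to a failure result rather than rejecting, which is the behavior the safe layer is described as providing.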

Access the safe layer via ai.safe.* — it mirrors every namespace:

ai.safe.conversation.*
ai.safe.chat.*
ai.safe.session.*
ai.safe.shorthand.*
ai.safe.historyOps.*
ai.safe.tokens.*
ai.safe.multi.*
ai.safe.completion.*
ai.safe.chat_shorthand.*

createClient(config)

import { createClient } from '@abdoseadaa/convai'

const ai = createClient({
  apiKey:        process.env.OPENAI_API_KEY!,     // string. Required. Your OpenAI API key.
  model:         'gpt-4o',                        // string, optional. Default model for all calls. Default: 'gpt-4o'
  defaultSystem: 'You are a helpful assistant.',  // string, optional. Default system prompt for session.startSession()
  store:         true,                            // boolean, optional. Store responses server-side. Default: true
  maxRetries:    2,                               // number, optional. Retries on 429/500/503. Default: 2
  timeoutMs:     30000,                           // number, optional. Request timeout in ms. Default: 30000
})

Stateful API Reference

`ai.conversation.*` — conversation container CRUD

Manages the conversation object — the root of every multi-turn chain.

Creating a conversation is a local, zero-cost operation — it mints a stable synthetic id (conv_<uuid>) and registers in-process state (chain head, system prompt, transcript). No model call is made until the first chat.send().

// Create a new conversation. Returns { id, created_at, metadata? }. No API call.
const conv = await ai.conversation.create()
const conv = await ai.conversation.create({ metadata: { userId: 'u_123' } })

// Retrieve conversation metadata by ID (local read)
const conv = await ai.conversation.get(convId)

// Update metadata on a conversation (local)
const conv = await ai.conversation.update(convId, { topic: 'typescript', status: 'active' })

// Delete a conversation — clears local state and best-effort deletes the
// latest stored response from OpenAI. Returns { id, deleted }.
await ai.conversation.remove(convId)
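The local, zero-cost nature of these calls can be pictured with a small in-memory registry. This is a sketch under assumptions: `ConversationRegistry`, `ConversationState`, and the field names `headResponseId`, `systemPrompt`, and `transcript` are hypothetical, chosen to match the state the text describes (chain head, system prompt, transcript), and are not taken from convai's source.

```typescript
import { randomUUID } from 'node:crypto'

// Hypothetical per-conversation state; field names are assumptions for illustration.
interface ConversationState {
  id: string
  created_at: number
  metadata: Record<string, string>
  headResponseId: string | null // latest response id in the chain, null before the first turn
  systemPrompt: string | null
  transcript: { role: 'user' | 'assistant' | 'system'; content: string }[]
}

class ConversationRegistry {
  private readonly conversations = new Map<string, ConversationState>()

  // Purely local: mints a synthetic id and registers state, no network call.
  create(metadata: Record<string, string> = {}): ConversationState {
    const state: ConversationState = {
      id: `conv_${randomUUID()}`,
      created_at: Math.floor(Date.now() / 1000),
      metadata,
      headResponseId: null,
      systemPrompt: null,
      transcript: [],
    }
    this.conversations.set(state.id, state)
    return state
  }

  get(id: string): ConversationState {
    const state = this.conversations.get(id)
    if (!state) throw new Error(`conversation not found: ${id}`)
    return state
  }

  // Merge new metadata keys into the existing metadata.
  update(id: string, metadata: Record<string, string>): ConversationState {
    const state = this.get(id)
    state.metadata = { ...state.metadata, ...metadata }
    return state
  }

  // Reports whether anything was actually removed.
  remove(id: string): { id: string; deleted: boolean } {
    return { id, deleted: this.conversations.delete(id) }
  }
}
```

Because the registry lives in process memory, everything in it disappears on restart, which is why the convId alone is not enough for durable sessions.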

`ai.chat.*` — raw AI response calls

Low-level calls — return the full OpenAI Response object. Use ai.shorthand.* for extracted text.

// Send a message — context is maintained automatically via previous_response_id
const response = await ai.chat.send(convId, 'What is a closure?')
const response = await ai.chat.send(convId, 'What is a closure?', {
  model:        'gpt-4o-mini',  // override model for this call
  instructions: 'Be concise.',  // one-off system override
  maxTokens:    500,
  temperature:  0.3,
})

// Send with a temporary system instructions override for this turn only
const response = await ai.chat.sendWithSystem(convId, 'Reply only in bullet points.', 'List 5 JS tips.')

// Get the raw async stream iterable — use ai.shorthand.streamAsk() for a simpler API
const stream = await ai.chat.stream(convId, 'Write a sorting algorithm.')

`ai.history.*` — conversation item management

Read and write individual items (messages) inside a conversation. History is backed by the conversation's local transcript — the Responses API can't return a full transcript in a single call, so convai records each turn as it happens.

// List items in a conversation — returns one page: { data, has_more, last_id }
const page = await ai.history.list(convId)
const page = await ai.history.list(convId, { limit: 50, order: 'asc', after: cursorId })

// Retrieve a single transcript item by ID
const item = await ai.history.getItem(convId, itemId)

// addItem annotates the LOCAL transcript only — no model call.
// A 'system' item also updates the conversation's persistent system prompt,
// so it is applied on every subsequent turn.
await ai.history.addItem(convId, { role: 'user',      content: 'A note for the record.' })
await ai.history.addItem(convId, { role: 'assistant', content: 'Understood.' })
await ai.history.addItem(convId, { role: 'system',    content: 'New persistent instructions.' })

// To inject messages the MODEL should treat as prior context, use
// historyOps.injectContext (it chains them into the response chain).

// Delete a specific item from the local transcript
await ai.history.removeItem(convId, itemId)
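The page shape `{ data, has_more, last_id }` and the `limit`, `order`, and `after` options suggest ordinary cursor pagination over the local transcript. A minimal sketch of that technique follows; `listPage`, `collectAll`, and the default values (limit 20, descending order) are assumptions made for illustration, not convai's documented behavior.

```typescript
interface Item { id: string; role: string; content: string }
interface Page { data: Item[]; has_more: boolean; last_id: string | null }
interface ListOpts { limit?: number; order?: 'asc' | 'desc'; after?: string | undefined }

// Return one page of items, starting just past the `after` cursor.
function listPage(items: Item[], opts: ListOpts = {}): Page {
  const { limit = 20, order = 'desc', after } = opts
  const ordered = order === 'asc' ? [...items] : [...items].reverse()
  // findIndex returns -1 for an unknown cursor, so the page then starts at index 0.
  const start = after ? ordered.findIndex((i) => i.id === after) + 1 : 0
  const data = ordered.slice(start, start + limit)
  const last = data[data.length - 1]
  return {
    data,
    has_more: start + limit < ordered.length,
    last_id: last ? last.id : null,
  }
}

// Follow the cursor until has_more is false, collecting every item in order.
function collectAll(items: Item[], limit: number): Item[] {
  const all: Item[] = []
  let after: string | undefined
  for (;;) {
    const page = listPage(items, { limit, order: 'asc', after })
    all.push(...page.data)
    if (!page.has_more || page.last_id === null) return all
    after = page.last_id
  }
}
```

The `collectAll` loop is the same pattern a "fetch everything" helper would need: pass the previous page's `last_id` as the next `after` until `has_more` is false.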

`ai.response.*` — response object management

Manage the individual stored response objects that form a conversation chain.

// Retrieve a stored response by its ID
const res = await ai.response.get(responseId)

// Delete a stored response
await ai.response.remove(responseId)

// Cancel an in-flight background response
await ai.response.cancel(responseId)

// Compact the context of a response — reduces token cost on subsequent turns
await ai.response.compact(responseId)

// List input items that were sent to a specific response (useful for debugging)
const items = await ai.response.getItems(responseId)

// Get the input token count for a response
const { input_tokens } = await ai.response.getTokenCount(responseId)

`ai.session.*` — session lifecycle shortcuts

High-level composed functions. Each replaces 2–4 raw API calls.

// quickChat — one-off question: creates conv → sends → returns text → deletes conv
// Creating the conv is local, so this costs a single model call.
const reply = await ai.session.quickChat('What is the capital of France?')
const reply = await ai.session.quickChat('Explain async/await.', { model: 'gpt-4o-mini' })

// startSession — create a conversation and set a PERSISTENT system prompt.
// The prompt is re-applied as instructions on every turn (no extra model call).
// Returns { convId, createdAt }. Store convId for all subsequent calls.
const { convId } = await ai.session.startSession()
const { convId } = await ai.session.startSession('You are a code reviewer. Be strict and concise.')
const { convId } = await ai.session.startSession('You are a helpful assistant.', {
  metadata: { userId: 'u_123', topic: 'code-review' }
})

// resetSession — delete a conversation and create a fresh one
// Useful for "clear chat" features. Returns the new convId.
const newConvId = await ai.session.resetSession(oldConvId)
const newConvId = await ai.session.resetSession(oldConvId, 'New system prompt for the fresh session.')

// cloneSession — branch a conversation into a new one.
// The clone shares the source's server-side chain head, so it continues with
// full prior context but diverges independently from the next turn onward.
const branchedConvId = await ai.session.cloneSession(sourceConvId)

`ai.shorthand.*` — stateful chat shortcuts

The most common operations — each returns a clean value with no parsing needed.

// ask — send a message and get the reply as a plain string
const reply = await ai.shorthand.ask(convId, 'What is TypeScript?')
const reply = await ai.shorthand.ask(convId, 'What is TypeScript?', { temperature: 0 })

// askWithSystem — one-off instructions override for a single turn
const reply = await ai.shorthand.askWithSystem(
  convId,
  'Reply only in Spanish.',
  'What is the weather like today?'
)

// streamAsk — stream the response, call onChunk for each delta, return full text when done
const fullText = await ai.shorthand.streamAsk(
  convId,
  'Write a merge sort implementation in TypeScript.',
  (chunk) => process.stdout.write(chunk)  // called for each text delta
)

// retry — cancel (if in-flight) + delete a response and resend
// Covers the "regenerate response" UX pattern.
const newResponse = await ai.shorthand.retry(convId, lastResponseId, 'Try again with more detail.')
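The streamAsk contract (call onChunk for each delta, return the full text at the end) boils down to a small accumulation loop over an async iterable. The sketch below shows that loop in isolation; `fakeDeltaStream` is a hypothetical stand-in for a model stream that yields plain strings, whereas a real stream yields event objects from which the text delta must first be extracted.

```typescript
// Hypothetical stand-in for a model stream: yields text deltas one at a time.
async function* fakeDeltaStream(parts: string[]): AsyncGenerator<string> {
  for (const part of parts) yield part
}

// Forward each delta to onChunk and return the concatenated text once the stream ends.
async function accumulateStream(
  stream: AsyncIterable<string>,
  onChunk: (chunk: string) => void,
): Promise<string> {
  let full = ''
  for await (const delta of stream) {
    onChunk(delta)
    full += delta
  }
  return full
}
```

A caller can render incrementally through `onChunk` and still store the complete reply from the resolved promise, without keeping its own buffer.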

`ai.historyOps.*` — history operations

Composed functions for reading, writing, and managing conversation history at scale.

// getFullHistory — fetch ALL items in a conversation with automatic pagination
// Returns a flat array — cursor pagination is handled transparently.
const items = await ai.historyOps.getFullHistory(convId)

// getLastReply — get the most recent assistant message as a plain string
const lastReply = await ai.historyOps.getLastReply(convId)  // string | null

// injectContext — seed real model context with multiple messages in ONE call
// Chains the messages into the response chain so the model treats them as prior
// context on subsequent turns (unlike history.addItem, which is local-only).
await ai.historyOps.injectContext(convId, [
  { role: 'user',      content: 'My name is Alice.' },
  { role: 'assistant', content: 'Got it, Alice.' },
  { role: 'user',      content: 'I am a TypeScript developer.' },
])

// pruneHistory — delete the oldest items, keeping only the last N
// Fetch history first, pass it in to avoid double-fetching.
const history = await ai.historyOps.getFullHistory(convId)
const result  = await ai.historyOps.pruneHistory(convId, 20, history)
// result: { deleted: 45, remaining: 20 }

// exportTranscript — format the conversation as text, JSON, or Markdown
const markdown = await ai.historyOps.exportTranscript(convId, 'markdown')
const jsonStr  = await ai.historyOps.exportTranscript(convId, 'json')
const plainTxt = await ai.historyOps.exportTranscript(convId, 'text')

// Optionally pass pre-fetched history to avoid a second round-trip
const history  = await ai.historyOps.getFullHistory(convId)
const markdown = await ai.historyOps.exportTranscript(convId, 'markdown', history)

// summarizeAndCompress — LLM-summarize the conversation, then replace the local
// transcript with a single summary item. Shrinks what history reads return.
const { summary, itemsRemoved } = await ai.historyOps.summarizeAndCompress(convId)
// summary: "The conversation covered X, Y, and Z..."
// itemsRemoved: 87

`ai.tokens.*` — token and cost tracking

// getTokenUsage — aggregate input/output tokens across all responses in a conversation
const usage = await ai.tokens.getTokenUsage(convId)
// {
//   inputTokens:      12400,
//   outputTokens:     3200,
//   totalTokens:      15600,
//   estimatedCostUsd: 0.063   ← based on gpt-4o pricing
// }

// compactIfNeeded — compact only if the token count exceeds a threshold
// Pass the latest responseId and a token threshold. Default threshold: 50,000.
const result = await ai.tokens.compactIfNeeded(convId, latestResponseId)
const result = await ai.tokens.compactIfNeeded(convId, latestResponseId, 100_000)
// {
//   compacted:    true,
//   reason:       'Token count (62000) exceeded threshold (50000). Compaction applied.',
//   tokensBefore: 62000
// }

`ai.multi.*` — multi-conversation operations

// broadcast — send the same message to multiple conversations in parallel
// Errors per conversation are caught individually — one failure won't abort the rest.
const results = await ai.multi.broadcast([convId1, convId2, convId3], 'Summarize your topic.')
// [
//   { convId: '...', reply: 'This conversation covered...', error: null },
//   { convId: '...', reply: null, error: 'Rate limit exceeded...' },
//   { convId: '...', reply: 'Topics included...', error: null },
// ]

// deleteAll — delete multiple conversations in parallel
const results = await ai.multi.deleteAll([convId1, convId2, convId3])
// [
//   { convId: '...', deleted: true,  error: null },
//   { convId: '...', deleted: false, error: 'Not found' },
// ]
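The "one failure won't abort the rest" behavior described for broadcast can be achieved by catching errors inside each parallel task, so that `Promise.all` never sees a rejection. A minimal sketch, assuming a hypothetical `send` function is injected in place of the real model call; `broadcastSketch` is not convai's implementation.

```typescript
interface BroadcastResult {
  convId: string
  reply: string | null
  error: string | null
}

// Run `send` for every conversation in parallel, capturing failures per item.
async function broadcastSketch(
  convIds: string[],
  message: string,
  send: (convId: string, message: string) => Promise<string>,
): Promise<BroadcastResult[]> {
  return Promise.all(
    convIds.map(async (convId): Promise<BroadcastResult> => {
      try {
        return { convId, reply: await send(convId, message), error: null }
      } catch (err) {
        // The error is recorded on this item only; other items are unaffected.
        const error = err instanceof Error ? err.message : String(err)
        return { convId, reply: null, error }
      }
    }),
  )
}
```

Results come back in the same order as the input ids, so callers can pair each result with its conversation by index or by `convId`.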

Stateless API Reference

`ai.completion.*` — raw chat completion calls

Low-level stateless calls — return the full ChatCompletion object.

// complete — raw call with full messages array
const res = await ai.completion.complete([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user',   content: 'What is a closure?' },
])
const text = res.choices[0].message.content

// structured — enforce a JSON schema on the output
const res = await ai.completion.structured(
  [{ role: 'user', content: 'Extract the person info from: John Doe, age 32.' }],
  {
    name:   'person',
    schema: {
      type:       'object',
      properties: { name: { type: 'string' }, age: { type: 'number' } },
      required:   ['name', 'age'],
    },
  }
)

// withTools — function/tool calling, returns full ChatCompletion
const res = await ai.completion.withTools(messages, tools)

// vision — image analysis
const res = await ai.completion.vision('https://example.com/image.jpg', 'What is in this image?')

// stream — raw async stream iterable
const stream = await ai.completion.stream(messages)

// jsonMode — less strict JSON output without schema enforcement
const res = await ai.completion.jsonMode(messages)

`ai.chat_shorthand.*` — stateless shorthand functions

The main consumer-facing stateless API. Returns clean values — no parsing needed.

// ask — messages array → reply string
const reply = await ai.chat_shorthand.ask([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user',   content: 'What is a closure?' },
])

// askOnce — single user message → reply string (simplest possible call)
const reply = await ai.chat_shorthand.askOnce('What is a closure?')
const reply = await ai.chat_shorthand.askOnce('What is a closure?', {
  system:      'Be concise.',
  temperature: 0,
  model:       'gpt-4o-mini',
})

// structured<T> — enforces JSON schema, returns parsed typed object
type Person = { name: string; age: number }
const person = await ai.chat_shorthand.structured<Person>(
  [{ role: 'user', content: 'Extract: John Doe, age 32.' }],
  {
    name:   'person',
    schema: {
      type:       'object',
      properties: { name: { type: 'string' }, age: { type: 'number' } },
      required:   ['name', 'age'],
    },
  }
)
// person.name === 'John Doe'
// person.age  === 32

// extractJSON<T> — JSON mode without schema, returns parsed object
type Tags = { tags: string[] }
const result = await ai.chat_shorthand.extractJSON<Tags>([
  { role: 'user', content: 'Return a JSON object with a tags array for: TypeScript, Node, MongoDB.' }
])
// result.tags === ['TypeScript', 'Node', 'MongoDB']

// withTools — function calling with pre-parsed args
const toolResult = await ai.chat_shorthand.withTools(messages, tools)
// {
//   calledTools: true,
//   toolCalls: [
//     { id: 'call_abc', name: 'get_weather', args: { city: 'Cairo' } }
//   ],
//   text: null,
//   raw: ChatCompletion
// }

// If the model didn't call a tool:
// { calledTools: false, toolCalls: [], text: 'Here is my answer...', raw: ... }

// vision — image URL or base64 → analysis string
const analysis = await ai.chat_shorthand.vision(
  'https://example.com/chart.png',
  'Describe the trend shown in this chart.',
  { detail: 'high' }  // 'auto' | 'low' | 'high'
)

// Base64 is also supported:
const analysis = await ai.chat_shorthand.vision(
  'data:image/jpeg;base64,/9j/4AAQSkZJR...',
  'What does this image show?'
)

// stream — messages → streaming with callback → full string returned
const fullText = await ai.chat_shorthand.stream(
  [{ role: 'user', content: 'Write a quicksort in TypeScript.' }],
  (chunk) => process.stdout.write(chunk)
)

// countTokens — estimate tokens before sending (sends 1-token request)
const { promptTokens, estimatedCostUsd } = await ai.chat_shorthand.countTokens([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user',   content: longDocumentText },
])
// { promptTokens: 8420, estimatedCostUsd: 0.02105 }

// classify — classify input into one of a set of labels
const label = await ai.chat_shorthand.classify(
  'I love this product, it works perfectly!',
  ['positive', 'negative', 'neutral']
)
// 'positive'

const label = await ai.chat_shorthand.classify(
  'The response time is very slow.',
  ['bug', 'feature-request', 'performance', 'docs'],
  'Classify this customer support ticket.',  // optional context
  { model: 'gpt-4o-mini' }
)

// summarize — summarize text in different styles
const brief   = await ai.chat_shorthand.summarize(longText)                     // 1-2 sentences
const bullets = await ai.chat_shorthand.summarize(longText, 'bullet')           // 3-5 bullet points
const detail  = await ai.chat_shorthand.summarize(longText, 'detailed')         // 2-3 paragraphs

// translate — translate text to any language
const spanish = await ai.chat_shorthand.translate('Hello, how are you?', 'Spanish')
const arabic  = await ai.chat_shorthand.translate('Good morning', 'Arabic')
const french  = await ai.chat_shorthand.translate(text, 'French', { model: 'gpt-4o-mini' })

Error Handling

The `SdkError` shape

Every error thrown by a Layer 1 function contains two things:

interface SdkError {
  formatted: FormattedError  // clean, typed, actionable
  raw:       unknown         // original OpenAI APIError — never swallowed
}

interface FormattedError {
  code:          ErrorCode        // SDK-level named code — switch on this
  status:        number           // HTTP status (0 for network errors)
  type:          string           // OpenAI's raw error.type string
  openaiCode:    string | null    // OpenAI's raw error.code string
  message:       string           // human-readable description
  hint:          string           // what to do about it
  retryable:     boolean          // safe to retry?
  retryAfterMs?: number           // from x-ratelimit-reset header — ms to wait
  requestId?:    string           // x-request-id — for OpenAI support tickets
  timestamp:     string           // ISO 8601 when the error occurred
}

`ErrorCode` enum: 16 codes plus the `UNKNOWN` fallback

import { ErrorCode } from '@abdoseadaa/convai'

// Auth (401) — never retry
ErrorCode.AUTH_INVALID_KEY      // bad, expired, or missing API key
ErrorCode.AUTH_NO_ORG           // account not part of an organization

// Access (403) — never retry
ErrorCode.ACCESS_DENIED         // no permission for the resource
ErrorCode.ACCESS_REGION_BLOCKED // region or country restriction
ErrorCode.ACCESS_IP_BLOCKED     // IP not on project allowlist

// Request (400 / 404 / 422) — never retry, fix the request
ErrorCode.NOT_FOUND             // conversation, item, or response not found
ErrorCode.BAD_REQUEST           // malformed parameters
ErrorCode.VALIDATION_FAILED     // schema or type mismatch (422)
ErrorCode.CONTEXT_TOO_LONG      // exceeds model's token limit
ErrorCode.INVALID_MODEL         // model doesn't exist or account lacks access

// Capacity (429) — RATE_LIMITED is retryable, QUOTA_EXCEEDED is not
ErrorCode.RATE_LIMITED          // RPM/TPM exceeded — retry with backoff
ErrorCode.QUOTA_EXCEEDED        // billing quota exhausted — fix billing, don't retry

// Server (500 / 502 / 503) — retry with backoff
ErrorCode.SERVER_ERROR          // internal server error (500/502)
ErrorCode.SERVICE_UNAVAILABLE   // overloaded (503)

// Network — retry
ErrorCode.CONNECTION_FAILED     // cannot reach the API
ErrorCode.TIMEOUT               // request timed out

// Fallback
ErrorCode.UNKNOWN               // catch-all for unexpected errors
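The split inside the 429 group is the part that most often goes wrong in hand-written wrappers, so here is a simplified sketch of mapping an HTTP status plus OpenAI's raw error code onto SDK-level codes. It is an assumption-laden illustration: `SketchCode` covers only a subset of the codes listed above, `classifyStatus` is a hypothetical name, and collapsing every 401 into `AUTH_INVALID_KEY` and every 403 into `ACCESS_DENIED` is a deliberate simplification.

```typescript
// Subset of the codes listed above, as string literals for this sketch.
type SketchCode =
  | 'AUTH_INVALID_KEY' | 'ACCESS_DENIED' | 'NOT_FOUND' | 'BAD_REQUEST'
  | 'VALIDATION_FAILED' | 'RATE_LIMITED' | 'QUOTA_EXCEEDED'
  | 'SERVER_ERROR' | 'SERVICE_UNAVAILABLE' | 'UNKNOWN'

// Codes that are safe to retry with backoff, per the grouping above.
const RETRYABLE = new Set<SketchCode>(['RATE_LIMITED', 'SERVER_ERROR', 'SERVICE_UNAVAILABLE'])

function toCode(status: number, openaiCode: string | null): SketchCode {
  switch (status) {
    case 400: return 'BAD_REQUEST'
    case 401: return 'AUTH_INVALID_KEY'
    case 403: return 'ACCESS_DENIED'
    case 404: return 'NOT_FOUND'
    case 422: return 'VALIDATION_FAILED'
    // Both cases arrive as HTTP 429; only the raw error code tells them apart.
    case 429: return openaiCode === 'insufficient_quota' ? 'QUOTA_EXCEEDED' : 'RATE_LIMITED'
    case 500:
    case 502: return 'SERVER_ERROR'
    case 503: return 'SERVICE_UNAVAILABLE'
    default: return 'UNKNOWN'
  }
}

function classifyStatus(
  status: number,
  openaiCode: string | null = null,
): { code: SketchCode; retryable: boolean } {
  const code = toCode(status, openaiCode)
  return { code, retryable: RETRYABLE.has(code) }
}
```

Keeping the retryable set separate from the mapping makes it easy to audit which codes a retry loop will ever act on.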

Type guard helpers

import {
  isRetryable,
  isAuthError,
  isRateLimitError,
  isQuotaExceededError,
  isAccessError,
  isNotFoundError,
  isNetworkError,
  isServerError,
  toLogObject,
} from '@abdoseadaa/convai'

try {
  const reply = await ai.shorthand.ask(convId, message)
} catch (err) {
  const e = err as SdkError

  if (isRateLimitError(e)) {
    const wait = e.formatted.retryAfterMs ?? 5000
    await sleep(wait)
    // retry...
  }

  if (isQuotaExceededError(e)) {
    // Don't retry — this is a billing issue
    notifyOpsTeam('OpenAI quota exhausted')
    return
  }

  if (isAuthError(e)) {
    // Surface to user — won't resolve with retries
    throw new Error('AI service authentication failed. Contact support.')
  }

  if (isNetworkError(e) || isServerError(e)) {
    // Safe to retry with exponential backoff
    scheduleRetry(e.formatted.retryAfterMs)
  }

  // Log safely — strips raw to avoid logging API keys or sensitive headers
  logger.error('AI call failed', toLogObject(e))
}

Retry pattern example

const sleep = (ms: number) => new Promise(r => setTimeout(r, ms))

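async function askWithRetry(convId: string, message: string, maxAttempts = 3): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await ai.safe.shorthand.ask(convId, message)

    if (result.success) return result.data

    const { error } = result

    // Give up on non-retryable errors or once the attempt budget is spent
    if (!error.retryable || attempt === maxAttempts) {
      throw new Error(`${error.code}: ${error.message}`)
    }

    // Prefer the server-provided delay; otherwise back off exponentially (2s, 4s, ...)
    const wait = error.retryAfterMs ?? Math.pow(2, attempt) * 1000
    await sleep(wait)
  }

  // Only reachable when maxAttempts < 1
  throw new Error('askWithRetry: maxAttempts must be at least 1')
}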

TypeScript

convai is written in TypeScript with strict mode enabled. Full type definitions are included — no separate @types package needed.

Exported types

import type {
  // Client
  ConvSdkClient,       // return type of createClient()

  // Config and options
  ClientConfig,        // createClient() config
  ChatOptions,         // per-call options for stateful chat
  ConversationOptions, // conversation.create() options
  ListOptions,         // history.list() pagination options
  CompletionOptions,   // base options for all completion calls
  StructuredOptions,   // extends CompletionOptions — adds strict
  ToolOptions,         // extends CompletionOptions — adds toolChoice
  VisionOptions,       // extends CompletionOptions — adds detail
  ToolDefinition,      // OpenAI.ChatCompletionTool re-export
  CompletionMessage,   // OpenAI.ChatCompletionMessageParam re-export

  // Results
  SdkResult,           // discriminated union: { success, data, error }
  SdkError,            // { formatted: FormattedError, raw: unknown }
  FormattedError,      // the clean error shape
  WithToolsResult,     // return type of chat_shorthand.withTools()
  TokenUsage,          // return type of tokens.getTokenUsage()
  CompactionResult,    // return type of tokens.compactIfNeeded()
  TranscriptMessage,   // { role, text, itemId }
  RawItem,             // raw conversation item from OpenAI

  // Errors
  ErrorCode,           // enum of SDK error codes (16 plus the UNKNOWN fallback)
} from '@abdoseadaa/convai'

Common Patterns

Chatbot with session persistence

import { createClient } from '@abdoseadaa/convai'

const ai = createClient({ apiKey: process.env.OPENAI_API_KEY! })

// On first message — store convId in your DB per user
const { convId } = await ai.session.startSession('You are a helpful assistant.')
await db.users.update({ id: userId }, { aiConvId: convId })

// On follow-up messages — load convId from DB
const user   = await db.users.findById(userId)
const reply  = await ai.shorthand.ask(user.aiConvId, userMessage)

// On "clear chat"
const newConvId = await ai.session.resetSession(user.aiConvId)
await db.users.update({ id: userId }, { aiConvId: newConvId })

Express route controller (safe layer)

import express from 'express'
import { createClient } from '@abdoseadaa/convai'

const ai  = createClient({ apiKey: process.env.OPENAI_API_KEY! })
const app = express()
app.use(express.json())  // parse JSON bodies so req.body is populated

app.post('/chat', async (req, res) => {
  const { convId, message } = req.body

  const result = await ai.safe.shorthand.ask(convId, message)

  if (!result.success) {
    const status = result.error.status || 500
    return res.status(status).json({
      error:     result.error.code,
      message:   result.error.message,
      retryable: result.error.retryable,
    })
  }

  res.json({ reply: result.data })
})

Function calling loop

import type { ToolDefinition, CompletionMessage } from '@abdoseadaa/convai'

const tools: ToolDefinition[] = [
  {
    type:     'function',
    function: {
      name:        'get_weather',
      description: 'Get current weather for a city',
      parameters: {
        type:       'object',
        properties: { city: { type: 'string' } },
        required:   ['city'],
      },
    },
  },
]

let messages: CompletionMessage[] = [
  { role: 'user', content: 'What is the weather in Cairo?' }
]

const result = await ai.chat_shorthand.withTools(messages, tools)

if (result.calledTools) {
  const call   = result.toolCalls[0]
  const weather = await getWeather(call.args.city as string)  // your function

  // Continue the loop with the tool result
  messages = [
    ...messages,
    { role: 'assistant', content: '', tool_calls: [{ id: call.id, type: 'function', function: { name: call.name, arguments: JSON.stringify(call.args) } }] },
    { role: 'tool', content: JSON.stringify(weather), tool_call_id: call.id },
  ]

  const finalReply = await ai.chat_shorthand.ask(messages)
  console.log(finalReply)
}
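The `args` object used in the loop above arrives already parsed. For readers curious what that parsing involves, here is a minimal sketch that converts the JSON-string `arguments` of a Chat Completions message into objects. The `RawMessage` and `RawToolCall` types are trimmed-down local copies of the Chat Completions message shape, and `parseToolCalls` is a hypothetical helper; falling back to empty args on malformed JSON is this sketch's choice, not documented convai behavior.

```typescript
// Trimmed-down local copies of the Chat Completions message shape.
interface RawToolCall {
  id: string
  type: 'function'
  function: { name: string; arguments: string } // arguments is a JSON string
}
interface RawMessage { content: string | null; tool_calls?: RawToolCall[] }

interface ParsedToolCall { id: string; name: string; args: Record<string, unknown> }
interface ParsedResult { calledTools: boolean; toolCalls: ParsedToolCall[]; text: string | null }

function parseToolCalls(message: RawMessage): ParsedResult {
  const toolCalls = (message.tool_calls ?? []).map((call): ParsedToolCall => {
    let args: Record<string, unknown> = {}
    try {
      const parsed: unknown = JSON.parse(call.function.arguments)
      // Accept only plain objects; arrays and primitives are ignored.
      if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
        args = parsed as Record<string, unknown>
      }
    } catch {
      // Malformed JSON from the model: keep the empty args object in this sketch.
    }
    return { id: call.id, name: call.function.name, args }
  })
  return { calledTools: toolCalls.length > 0, toolCalls, text: message.content }
}
```

Since `args` values are typed as `unknown`, callers still need a cast or a runtime check, as with `call.args.city as string` in the loop above.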

Auto-manage context length

// After every N turns, check token usage and compress if needed
const usage = await ai.tokens.getTokenUsage(convId)

if (usage.totalTokens > 80_000) {
  const { summary, itemsRemoved } = await ai.historyOps.summarizeAndCompress(convId)
  console.log(`Compressed: removed ${itemsRemoved} items. Summary: ${summary}`)
}

// Or use the automatic threshold helper
const { compacted } = await ai.tokens.compactIfNeeded(convId, latestResponseId, 80_000)

Broadcast to multiple conversations

// Run the same prompt across multiple user conversations in parallel
const userConvIds = await db.sessions.getActiveConvIds()

const results = await ai.multi.broadcast(userConvIds, 'System update: new features are available.')

const failed = results.filter(r => r.error !== null)
if (failed.length > 0) {
  logger.warn('Broadcast partial failure', { failed })
}

File Structure

convai/
├── dist/                             ← compiled output (what npm ships)
│   ├── index.js / index.d.ts        ← package entry
│   ├── createClient.js              ← factory function
│   ├── types/                       ← all interfaces and option types
│   ├── errors/                      ← error codes, handler, guards
│   ├── core/                        ← Layer 1: raw API calls
│   ├── composed/                    ← Layer 1: composed shorthand functions
│   ├── safe/                        ← Layer 2: SdkResult wrappers
│   └── utils/                       ← wrapSafe utility
├── README.md
├── LICENSE
└── package.json

Source code is excluded from the npm package — only dist/ is shipped.

Notes

Conversations API compatibility — the client.conversations.* API is not yet available in the current openai npm SDK (v4.x). convai emulates it using responses.create() with previous_response_id chaining and store: true — which is exactly what the Conversations API wraps internally. Each turn chains off the conversation's current head (the latest response id), so multi-turn context accumulates correctly. When client.conversations becomes available in a future SDK release, convai will upgrade internally with no breaking changes to the consumer API.
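The chaining described here can be sketched as a function that sends one turn and then advances the chain head. This is an illustration under assumptions: `create` stands in for the SDK's `responses.create()` and is injected so the sketch runs without network access, `Chain` and `sendTurn` are hypothetical names, and the `input` and `output_text` fields follow the Responses API as the author of this sketch understands it rather than anything stated in this README.

```typescript
// Parameters this sketch passes to the injected create function.
interface CreateParams {
  model: string
  input: string
  store: boolean
  instructions?: string
  previous_response_id?: string
}
interface CreatedResponse { id: string; output_text: string }
type CreateFn = (params: CreateParams) => Promise<CreatedResponse>

// Hypothetical per-conversation chain state.
interface Chain { head: string | null; systemPrompt: string | null }

// Send one turn chained off the current head, then move the head forward.
async function sendTurn(
  chain: Chain,
  input: string,
  create: CreateFn,
  model = 'gpt-4o',
): Promise<string> {
  const response = await create({
    model,
    input,
    store: true, // the next turn can only chain off a stored response
    // Instructions are re-sent on every turn because they do not carry across the chain.
    ...(chain.systemPrompt ? { instructions: chain.systemPrompt } : {}),
    // The first turn has no head, so previous_response_id is omitted.
    ...(chain.head ? { previous_response_id: chain.head } : {}),
  })
  chain.head = response.id
  return response.output_text
}
```

Because the head is updated only after a successful call, a failed turn leaves the chain pointing at the last good response.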

Conversation state & persistence — per-conversation state (the chain head, the persistent system prompt, and the local message transcript) is kept in-memory for the lifetime of the createClient() instance. It is not shared across processes or restarts. For durable chatbots, persist the convId plus your own message log in your database; on a new process, re-seed context with historyOps.injectContext() if you need the model to remember prior turns. The system prompt set by startSession() is re-applied as instructions on every turn, because the Responses API does not carry instructions across a previous_response_id chain.
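One way to apply this advice after a restart is to start a fresh session and re-seed it from your own message log. The sketch below assumes a structural subset of the client (`RehydrateClient`) whose two method signatures follow the examples in this README; `rehydrateConversation`, `StoredMessage`, and the `maxMessages` cap are hypothetical additions for illustration.

```typescript
interface StoredMessage { role: 'user' | 'assistant'; content: string }

// Structural subset of the client used here; signatures mirror this README's examples.
interface RehydrateClient {
  session: { startSession(system?: string): Promise<{ convId: string }> }
  historyOps: { injectContext(convId: string, messages: StoredMessage[]): Promise<unknown> }
}

// Start a new conversation and replay the most recent stored messages into it.
async function rehydrateConversation(
  client: RehydrateClient,
  systemPrompt: string,
  log: StoredMessage[],
  maxMessages = 20, // assumption: cap the replay to keep the seeded context small
): Promise<string> {
  const { convId } = await client.session.startSession(systemPrompt)
  const recent = log.slice(-maxMessages)
  if (recent.length > 0) {
    await client.historyOps.injectContext(convId, recent)
  }
  return convId
}
```

The returned convId replaces the one stored before the restart, so the database record should be updated with it.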

Token cost estimates — getTokenUsage() and countTokens() use gpt-4o pricing ($2.50/1M input, $10.00/1M output) as defaults. Actual costs vary by model.
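The arithmetic behind these estimates is a straightforward per-million-token multiplication. A minimal sketch using the gpt-4o prices quoted above; `Pricing`, `GPT_4O_PRICING`, and `estimateCostUsd` are hypothetical names, and passing a different pricing object is how you would adapt it to another model.

```typescript
interface Pricing { inputPerMillionUsd: number; outputPerMillionUsd: number }

// Default prices quoted in this README for gpt-4o.
const GPT_4O_PRICING: Pricing = { inputPerMillionUsd: 2.5, outputPerMillionUsd: 10 }

// cost = (inputTokens * inputPrice + outputTokens * outputPrice) / 1,000,000
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricing: Pricing = GPT_4O_PRICING,
): number {
  return (
    (inputTokens * pricing.inputPerMillionUsd + outputTokens * pricing.outputPerMillionUsd) /
    1_000_000
  )
}
```

For the usage figures shown in the getTokenUsage() example (12,400 input and 3,200 output tokens), this gives 0.031 + 0.032 = 0.063 USD.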

summarizeAndCompress() — uses the conversation's own context to summarize itself. For very long conversations, consider pruning first to avoid hitting the model's context limit before the summary call.

License

MIT
