# @flink-app/openai-adapter
OpenAI adapter for the Flink AI framework using the Responses API - OpenAI's modern API that provides step-aware reasoning, explicit tool invocation, and better performance.
## Why Responses API?
This adapter uses OpenAI's Responses API instead of the older Chat Completions API, providing:
- 🔧 Step-aware reasoning: Model returns multiple tool calls as explicit, typed items in a single response
- ⚡ Better performance: 3% improvement on SWE-bench, 5% on TAUBench vs Chat Completions
- 💰 Lower costs: 40-80% better cache utilization via response persistence
- 📦 First-class tool steps: Tool calls and results are structured items, not message hacks
- 🎯 Future-proof: All new OpenAI features will land in Responses API first
## Important: Understanding the Agent Loop
The Responses API does NOT run your agent loop.
What it provides:
- Multiple tool calls in one response (you still execute them)
- Structured step types (message, function_call, function_call_output)
- Optional response persistence for caching
What Flink handles (via AgentRunner):
- Tool execution
- Multi-turn loops (API → execute tools → API → execute tools → done)
- Deciding when to stop
- Managing conversation state
This separation is intentional - it gives you full control over agent behavior while leveraging better API primitives.
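For orientation, this is roughly the loop AgentRunner runs on your behalf. A minimal sketch: `callResponsesAPI` and `executeTool` are illustrative placeholders, not Flink or OpenAI APIs.

```typescript
// Conceptual sketch only - Flink's AgentRunner implements this loop for you.
type Item =
  | { type: "message"; role: string; content: string }
  | { type: "function_call"; name: string; arguments: string; call_id: string }
  | { type: "function_call_output"; call_id: string; output: string };

declare function callResponsesAPI(input: Item[]): Promise<{ output: Item[] }>;
declare function executeTool(name: string, args: string): Promise<unknown>;

async function agentLoop(input: Item[]): Promise<string> {
  for (;;) {
    const { output } = await callResponsesAPI(input); // one API turn
    input.push(...output); // keep the full conversation history

    const calls = output.filter(
      (i): i is Extract<Item, { type: "function_call" }> =>
        i.type === "function_call"
    );
    if (calls.length === 0) {
      // No tool calls: the message item is the final answer, so stop the loop
      const msg = output.find((i) => i.type === "message");
      return msg?.type === "message" ? msg.content : "";
    }
    for (const call of calls) {
      // You (via AgentRunner) execute each requested tool...
      const result = await executeTool(call.name, call.arguments);
      // ...and feed the result back as a function_call_output item
      input.push({
        type: "function_call_output",
        call_id: call.call_id,
        output: JSON.stringify(result),
      });
    }
  }
}
```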
## Installation
```bash
npm install @flink-app/openai-adapter
# or
pnpm add @flink-app/openai-adapter
```

The `openai` package is included as a dependency, so you don't need to install it separately.
## Usage
### Basic Setup
```typescript
import { OpenAIAdapter } from "@flink-app/openai-adapter";
import { FlinkApp } from "@flink-app/flink";

const app = new FlinkApp({
  ai: {
    llms: {
      default: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "gpt-5",
      }),
    },
  },
});

await app.start();
```

Legacy API (still supported):
```typescript
// Backward-compatible constructor
new OpenAIAdapter(process.env.OPENAI_API_KEY!, "gpt-5");
```

### Agent Instructions
Define your agent's behavior using the `instructions` property:
```typescript
// src/agents/support_agent.ts
import { FlinkAgentProps } from "@flink-app/flink";

export const Agent: FlinkAgentProps = {
  name: "support_agent",
  instructions: "You are a helpful customer support agent.",
  tools: ["get_order_status"],
  model: { adapterId: "default" },
};
```

How it works:
- Instructions are prepended as a system message to every conversation
- Follows Vercel AI SDK pattern for consistency
- Provides stable agent behavior across all interactions
### Dynamic Context with System Messages
For per-request context, add system messages to the conversation:
```typescript
const result = await ctx.agents.myAgent.execute({
  message: [
    { role: "system", content: "Current user tier: Premium" },
    { role: "user", content: "What can I do?" },
  ],
});
```

Order of messages sent to OpenAI:
1. Agent `instructions` (as a system message)
2. User-provided system messages (if any)
3. Conversation messages
This gives you both static agent behavior and dynamic per-request context.
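For example, combining the agent defined earlier with the request above, the assembled conversation would look like this (illustrative only):

```typescript
// Effective message order the adapter sends (illustrative)
const effectiveConversation = [
  { role: "system", content: "You are a helpful customer support agent." }, // 1. agent instructions
  { role: "system", content: "Current user tier: Premium" },                // 2. per-request system message
  { role: "user", content: "What can I do?" },                              // 3. conversation messages
];
```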
## Structured Outputs
Enable structured outputs with a JSON schema to guarantee the output format:
```typescript
import { OpenAIAdapter } from "@flink-app/openai-adapter";

const adapter = new OpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-5",
  structuredOutput: {
    type: "json_schema",
    name: "car_analysis",
    description: "Analysis of car specifications",
    schema: {
      type: "object",
      properties: {
        brand: { type: "string" },
        model: { type: "string" },
        year: { type: "number" },
        features: {
          type: "array",
          items: { type: "string" },
        },
      },
      required: ["brand", "model", "year"],
      additionalProperties: false,
    },
    strict: true, // Enforces 100% schema adherence
  },
});
```

Benefits of Structured Outputs:
- 100% reliability (vs ~95% with JSON mode)
- No need for retry logic or manual validation
- Automatic schema validation during generation
- Supported on all modern models: `gpt-4.1`, `gpt-5`, `o4-mini`, `o3`
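With `strict: true`, the generated text is guaranteed to match the schema, so a caller can parse it directly. A sketch, assuming a hypothetical `carAnalyst` agent and that the agent result exposes the generated text as `result.text` (the exact field on Flink's result object may differ):

```typescript
// The schema above guarantees well-formed JSON, so no retry or validation logic is needed.
interface CarAnalysis {
  brand: string;
  model: string;
  year: number;
  features?: string[];
}

const result = await ctx.agents.carAnalyst.execute({
  message: "Summarize the specs of the 2021 Honda Civic",
});

// `result.text` is assumed here; check Flink's LLMResponse shape in your version.
const analysis: CarAnalysis = JSON.parse(result.text);
console.log(analysis.brand, analysis.year);
```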
## Zero Data Retention (ZDR)
For organizations with compliance or data retention requirements:
```typescript
const adapter = new OpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  model: "gpt-5",
  persistResponse: false, // Don't store responses on OpenAI servers
});
```

What `persistResponse` does:
- `true` (default): OpenAI stores the response for caching and retrieval via `response_id`
- `false`: No data stored on OpenAI servers (ZDR compliance)
What it does NOT do:
- It does NOT automatically manage conversation state
- You still need to pass full conversation history in messages
- It's purely about server-side response persistence
Note: OpenAI automatically enforces `persistResponse: false` for Zero Data Retention organizations.
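In other words, each call carries its own history regardless of the persistence setting. A sketch using the message-array form shown earlier (agent name illustrative):

```typescript
// You supply the full conversation every turn - nothing is replayed server-side.
const result = await ctx.agents.supportAgent.execute({
  message: [
    { role: "user", content: "Where is my order?" },
    { role: "assistant", content: "Could you share your order number?" },
    { role: "user", content: "It's 12345." }, // latest turn; prior turns included by you
  ],
});
```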
## Multiple Adapters
You can register multiple OpenAI adapters with different configurations:
```typescript
import { OpenAIAdapter } from "@flink-app/openai-adapter";
import { FlinkApp } from "@flink-app/flink";

const app = new FlinkApp({
  ai: {
    llms: {
      // Default GPT-5 - best for general tasks
      default: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "gpt-5",
      }),
      // Fast reasoning model - cost-efficient
      fast: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "o4-mini",
      }),
      // Maximum intelligence for complex reasoning
      smart: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "o3",
      }),
      // With structured output
      structured: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "gpt-5",
        structuredOutput: {
          type: "json_schema",
          name: "response",
          schema: { /* your schema */ },
          strict: true,
        },
      }),
    },
  },
});
```

## Using in Agents
Reference the adapter by its registered ID in your agent configuration:
```typescript
// src/agents/support_agent.ts
import { FlinkAgentProps } from "@flink-app/flink";

export const Agent: FlinkAgentProps = {
  name: "support_agent",
  description: "Customer support assistant",
  instructions: "You are a helpful customer support agent.",
  tools: ["get_order_status", "search_knowledge_base"],
  model: {
    adapterId: "default", // Uses the "default" adapter
    maxTokens: 2000,
    temperature: 0.7,
  },
};
```

## Supported Models
This adapter works with all OpenAI models available via the Responses API. The latest models (as of 2026) offer significant improvements:
### GPT-5 Series (Recommended)

- GPT-5: `gpt-5`
  - Latest and most capable model
  - Best for general-purpose applications
  - Excellent at coding, reasoning, and agentic tasks

### GPT-4.1 Series

- GPT-4.1: `gpt-4.1`
  - Smartest non-reasoning model
  - Excellent at coding tasks
  - Strong at precise instruction following
  - Best for web development and technical tasks
- GPT-4.1 mini: `gpt-4.1-mini`
  - Smaller, faster, more cost-efficient
  - Good balance of capability and cost
- GPT-4.1 nano: `gpt-4.1-nano`
  - Ultra-fast and cost-efficient
  - Best for simple, high-volume tasks

### O-Series Reasoning Models

- o4-mini: `o4-mini` (recommended for reasoning tasks)
  - Fast, cost-efficient reasoning model
  - Best-performing on AIME 2024 and 2025 benchmarks
  - Optimized for mathematical and logical reasoning
- o3: `o3`
  - Advanced reasoning model for complex tasks
  - State-of-the-art performance on coding, math, and science
  - Excellent at Codeforces, SWE-bench, and MMMU
- o3-pro: `o3-pro`
  - Premium reasoning model (Pro users only)
  - Designed to think longer and provide the most reliable responses

### Legacy Models

For backwards compatibility:

- GPT-4 Turbo: `gpt-4-turbo`
- GPT-4: `gpt-4`
- GPT-3.5 Turbo: `gpt-3.5-turbo`
Note: Some older models (GPT-4o, early GPT-4.1 variants) are being retired in 2026. Migrate to the latest models for continued support.
## Model Selection Guide
| Use Case | Recommended Model | Why |
|----------|------------------|-----|
| General development | gpt-5 | Latest and most capable |
| Coding & technical | gpt-4.1 | Best instruction following |
| High-volume tasks | gpt-4.1-mini | Cost-efficient with good performance |
| Mathematical reasoning | o4-mini | Optimized for math, fast and cost-efficient |
| Complex problem-solving | o3 | State-of-the-art reasoning |
| Mission-critical | o3-pro | Maximum reliability (Pro users) |
## Features
- ✅ Step-based tool calling - Multiple tool calls in one response as typed items
- ✅ Event-based streaming - Proper event taxonomy (not just token streaming)
- ✅ Structured outputs with JSON schema (100% reliability)
- ✅ Response persistence for better caching (optional)
- ✅ First-class tool steps - `function_call` and `function_call_output` as explicit types
- ✅ Zero Data Retention mode for compliance
- ✅ Support for all OpenAI models
- ✅ 40-80% cost savings via better caching
- ✅ 3-5% performance improvement over Chat Completions
## What Makes Responses API Different
The key difference is how tool calls are represented, not who executes them:
Chat Completions (old):

```
Response: {
  message: {
    content: "...",
    tool_calls: [...] // Tool calls embedded in message
  }
}
```

You: Extract tool calls, execute them, create new messages, call the API again.

Responses API (new):
```
Response: {
  output: [
    { type: "message", content: "..." },
    { type: "function_call", name: "...", call_id: "..." },
    { type: "function_call", name: "...", call_id: "..." } // Multiple calls!
  ]
}
```

You: Extract tool calls, execute them, create `function_call_output` items, call the API again.

Key improvements:
- Multiple tool calls per response: Model can request several tools at once
- Explicit step types: No more message role gymnastics
- Better structure: `function_call_output` items instead of cramming results into user messages
- Clearer semantics: Steps are first-class, not message metadata
Still your responsibility:
- Executing the tools
- Deciding when to stop the loop
- Managing conversation history
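To make the shape difference concrete, here is a sketch of extracting tool calls from each response; the types are simplified stand-ins for the shapes shown above, not the official SDK types:

```typescript
// Simplified response shapes for illustration
declare const chatResponse: {
  choices: { message: { content: string | null; tool_calls?: unknown[] } }[];
};
declare const responsesResponse: {
  output: { type: string; name?: string; call_id?: string }[];
};

// Chat Completions: tool calls ride inside the assistant message
const chatToolCalls = chatResponse.choices[0].message.tool_calls ?? [];

// Responses API: tool calls are top-level typed items, so several can
// appear in one response without any message-role workarounds
const responseToolCalls = responsesResponse.output.filter(
  (item) => item.type === "function_call"
);
```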
## API

### OpenAIAdapter
```typescript
interface OpenAIAdapterOptions {
  apiKey: string;
  model: string;
  structuredOutput?: {
    type: "json_schema";
    name: string;
    description?: string;
    schema: Record<string, any>;
    strict?: boolean;
  };
  persistResponse?: boolean; // Default: true
}

class OpenAIAdapter implements LLMAdapter {
  constructor(options: OpenAIAdapterOptions);
  constructor(apiKey: string, model: string); // Legacy
}
```

### Parameters
- `apiKey`: Your OpenAI API key
- `model`: The OpenAI model to use (e.g., `"gpt-5"`, `"o4-mini"`)
- `structuredOutput`: Optional JSON schema for structured outputs
- `persistResponse`: Whether to persist responses on OpenAI servers for caching (default: `true`)
## Architecture Notes

### Responses API vs Chat Completions
This adapter uses OpenAI's Responses API, which differs from Chat Completions in several ways:
Request Format:

- Chat Completions: `messages` array with system/user/assistant roles
- Responses API: `input` array with typed items (messages, function_call_outputs, etc.) + separate `instructions` field

Response Format:

- Chat Completions: `choices[0].message.content`
- Responses API: `output` array of items with type `message`, `function_call`, etc.

Tool/Function Format:

- Chat Completions: Externally-tagged `{ type: "function", function: {...} }`
- Responses API: Internally-tagged `{ type: "function", name: "...", ... }` (strict by default)

Structured Outputs:

- Chat Completions: `response_format: { type: "json_schema", json_schema: {...} }`
- Responses API: `text: { format: { type: "json_schema", ... } }`

State Management:

- Chat Completions: Manual conversation state management
- Responses API: Optional response persistence with `persistResponse: true` (for caching, not automatic state replay)
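Side by side, the two request bodies differ roughly as follows (abridged sketch; `store` is the Responses API persistence flag that `persistResponse` is assumed to map to):

```typescript
// Abridged request bodies for comparison (not complete API payloads)

// Chat Completions: one role-tagged `messages` array carries everything
const chatRequest = {
  model: "gpt-5",
  messages: [
    { role: "system", content: "You are a helpful customer support agent." },
    { role: "user", content: "Where is my order?" },
  ],
};

// Responses API: `instructions` is a separate field and `input` holds typed items
const responsesRequest = {
  model: "gpt-5",
  instructions: "You are a helpful customer support agent.",
  input: [
    { role: "user", content: "Where is my order?" },
    // ...later turns also carry function_call / function_call_output items
  ],
  store: true, // response persistence; assumed target of persistResponse
};
```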
### Flink Integration
The adapter integrates with Flink's `LLMAdapter` interface:

- Flink's `instructions` → Responses API `instructions`
- Flink's `messages` → Converted to Responses API `input` items (typed steps)
- Flink's tool schema → Converted to Responses API function format (internally-tagged, strict by default)
- Responses API `output` → Extracted to Flink's `LLMResponse` format
Each API call is one turn:

1. Flink calls `adapter.execute()` → One Responses API request
2. The response may contain multiple tool calls (as separate items)
3. Flink's AgentRunner executes those tools
4. Flink calls `adapter.execute()` again with tool results → Another Responses API request
5. Repeat until no more tool calls

This is the standard agent-loop architecture used by modern frameworks (LangGraph, Vercel AI SDK, etc.).
## Migration from Chat Completions
If you're coming from the Chat Completions API, the good news is: no code changes needed!
The adapter handles all the differences internally:
```typescript
// This works the same way with both APIs
const app = new FlinkApp({
  ai: {
    llms: {
      default: new OpenAIAdapter({
        apiKey: process.env.OPENAI_API_KEY!,
        model: "gpt-5", // Just update the model
      }),
    },
  },
});
```

Benefits of upgrading:
- ✅ Step-aware reasoning (multiple tool calls per response)
- ✅ Better performance (3-5% improvement on benchmarks)
- ✅ Lower costs (40-80% better caching via response persistence)
- ✅ First-class tool steps (cleaner than message role hacks)
- ✅ Future-proof (new features land here first)
## Requirements
- Node.js >= 18
- @flink-app/flink >= 1.0.0
- openai >= 4.77.0 (with Responses API support)
## License
MIT
