@epicdm/flowstate-agents-llm-client
v1.0.7
Unified LLM API client for Epic Flow with multi-provider support
@epic-flow/llm-client
A unified, production-ready LLM client for Epic Flow with multi-provider support, streaming capabilities, automatic retries, and seamless integration with Agent Memory Server (AMS) and Knowledge Store.
Features
- Multi-Provider Support - Work with Anthropic Claude, OpenAI GPT, and LM Studio with a unified API
- Streaming Responses - Stream text generation for real-time user experiences
- Automatic Retry Logic - Exponential backoff with configurable retry strategies
- Cost Tracking - Automatic cost calculation for all providers
- Agent Memory Server (AMS) Integration - Maintain conversation context across sessions
- Knowledge Store Integration - Enhance prompts with relevant knowledge automatically
- TypeScript First - Full type safety with comprehensive type definitions
- Environment Agnostic - Works in Node.js and browser environments
- Event Hooks - Monitor and control request lifecycle with callbacks
- Conversation History - Automatic conversation management
- Budget Controls - Set spending limits and token caps
Installation
yarn add @epic-flow/llm-client
Peer Dependencies
The package has optional peer dependencies for advanced features:
# Optional: For AMS integration
yarn add @epic-flow/flowstate-agents-memory-client
# Optional: For Knowledge Store integration
yarn add @epic-flow/flowstate-agents-knowledge-store
Quick Start
Basic Query
import { LLMClient } from '@epic-flow/llm-client';
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
});
const result = await client.query({
prompt: 'What is the capital of France?',
});
console.log(result.content); // "The capital of France is Paris."
console.log(result.cost.totalCostUSD); // e.g. 0.0012
Streaming Response
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o',
apiKey: process.env.OPENAI_API_KEY,
});
for await (const chunk of client.stream({
prompt: 'Write a short poem about coding',
})) {
if (chunk.type === 'text-delta') {
process.stdout.write(chunk.textDelta);
} else if (chunk.type === 'finish') {
console.log('\n\nCost:', chunk.cost.totalCostUSD);
}
}
Provider Configuration
Anthropic Claude
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
temperature: 0.7,
maxTokens: 2000,
});
Available Models:
- claude-opus-4 - Most powerful, best for complex tasks
- claude-sonnet-4 - Balanced performance and cost
- claude-haiku-4 - Fast and cost-effective
OpenAI GPT
const client = new LLMClient({
provider: 'openai',
model: 'gpt-4o',
apiKey: process.env.OPENAI_API_KEY,
});
Available Models:
- gpt-4o - Latest GPT-4 optimized
- gpt-4-turbo - High performance GPT-4
- gpt-3.5-turbo - Cost-effective option
LM Studio (Local Models)
const client = new LLMClient({
provider: 'lmstudio',
model: 'your-local-model',
baseUrl: 'http://localhost:1234/v1',
});
Advanced Features
Retry Configuration
Automatic retry with exponential backoff for transient failures:
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
retry: {
maxAttempts: 3,
initialDelayMs: 1000,
maxDelayMs: 10000,
backoffMultiplier: 2,
onRetry: (attempt, error) => {
console.log(`Retry attempt ${attempt}:`, error.message);
},
},
});
Agent Memory Server (AMS) Integration
Maintain conversation context across sessions:
import { createAMSClient } from '@epic-flow/flowstate-agents-memory-client';
const amsClient = createAMSClient({
serverUrl: 'http://localhost:7001',
apiKey: process.env.AMS_API_KEY,
});
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
ams: {
client: amsClient,
sessionId: 'user-session-123',
namespace: 'chat',
contextWindowMax: 10,
autoSave: true,
includeContextByDefault: true,
},
});
// AMS will automatically summarize and store conversation history
const result = await client.query({
prompt: 'What did we discuss earlier?',
includeAMSContext: true,
saveToAMS: true,
});
Knowledge Store Integration
Enhance prompts with relevant knowledge automatically:
import { createKnowledgeStoreClient } from '@epic-flow/flowstate-agents-knowledge-store';
const knowledgeClient = createKnowledgeStoreClient({
serverUrl: 'http://localhost:7002',
});
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
knowledgeStore: {
client: knowledgeClient,
agentId: 'support-agent',
autoEnhancePrompt: true,
knowledgeTypes: ['policies', 'procedures'],
maxKnowledgeItems: 5,
minImportance: 0.7,
},
});
// Knowledge Store will automatically retrieve and inject relevant context
const result = await client.query({
prompt: 'What is our refund policy?',
});
Event Hooks
Monitor and control the request lifecycle:
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
onRequest: (request) => {
console.log('Request started:', request.timestamp);
},
onResponse: (response) => {
console.log('Response received:', {
latency: response.latency,
tokens: response.tokens,
cost: response.cost,
});
},
onError: (error) => {
console.error('Request failed:', error);
},
onStreamStart: (event) => {
console.log('Stream started:', event);
},
onStreamEnd: (event) => {
console.log('Stream ended:', event);
},
});
Budget Controls
Set spending limits and token caps:
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
budget: {
maxCostPerRequest: 0.10, // USD
maxTokensPerRequest: 4000,
dailyBudget: 10.00, // USD
onBudgetExceeded: (spent, limit) => {
console.warn(`Budget exceeded: $${spent} / $${limit}`);
},
},
});
Conversation History Management
Access and manage conversation history:
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
});
// First message
await client.query({ prompt: 'Hello!' });
// Second message - context is maintained
await client.query({ prompt: 'What did I just say?' });
// Get conversation history
const history = client.getHistory();
console.log(history);
// [
// { role: 'user', content: 'Hello!' },
// { role: 'assistant', content: 'Hello! How can I help you today?' },
// { role: 'user', content: 'What did I just say?' },
// { role: 'assistant', content: 'You said "Hello!"' }
// ]
// Clear history for new conversation
client.clearHistory();
Logging Configuration
Control logging behavior:
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
logging: {
level: 'debug',
logRequests: true,
logResponses: true,
logTokenUsage: true,
redactApiKey: true,
logger: {
debug: (msg, ...args) => console.debug(msg, ...args),
info: (msg, ...args) => console.info(msg, ...args),
warn: (msg, ...args) => console.warn(msg, ...args),
error: (msg, ...args) => console.error(msg, ...args),
},
},
});
API Documentation
LLMClient
Constructor
new LLMClient(config: LLMClientConfig)
Creates a new LLM client instance with the specified configuration.
Methods
query(params: QueryParams): Promise<LLMResult>
Execute a single query and get a complete response.
Parameters:
- prompt (string, required) - The user prompt
- systemPrompt (string, optional) - System instructions
- temperature (number, optional) - Override default temperature (0-1)
- maxTokens (number, optional) - Override default max tokens
- tools (Record<string, ToolDefinition>, optional) - Available tools
- toolChoice ('auto' | 'required' | 'none', optional) - Tool usage strategy
- maxToolRoundtrips (number, optional) - Max tool execution rounds
- includeAMSContext (boolean, optional) - Include AMS context
- saveToAMS (boolean, optional) - Save to AMS after query
- responseFormat ('text' | 'json', optional) - Response format
- schema (ZodSchema, optional) - Validation schema for JSON responses
Returns: Promise<LLMResult>
- content - The generated text
- usage - Token usage statistics
- cost - Cost breakdown in USD
- toolCalls - Tool calls made (if any)
- metadata - Request metadata (provider, model, latency, etc.)
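The relationship between the usage and cost fields can be illustrated with a minimal sketch. This is not the package's implementation: the ModelPricing shape, the estimateCost helper, and the prices used below are hypothetical, chosen only to show how per-token rates could turn a TokenUsage into a CostBreakdown.

```typescript
// Mirrors the TokenUsage and CostBreakdown types documented in this README.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  cacheCreationTokens: number;
  cacheReadTokens: number;
  totalTokens: number;
}

interface CostBreakdown {
  inputCostUSD: number;
  outputCostUSD: number;
  cacheCostUSD: number;
  totalCostUSD: number;
  model: string;
  currency: 'USD';
}

// Hypothetical pricing shape: USD per one million tokens.
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
  cacheWritePerMTok: number;
  cacheReadPerMTok: number;
}

// Multiply first, then divide, to keep round numbers exact in floating point.
const costFor = (tokens: number, ratePerMTok: number): number =>
  (tokens * ratePerMTok) / 1_000_000;

function estimateCost(
  usage: TokenUsage,
  pricing: ModelPricing,
  model: string,
): CostBreakdown {
  const inputCostUSD = costFor(usage.inputTokens, pricing.inputPerMTok);
  const outputCostUSD = costFor(usage.outputTokens, pricing.outputPerMTok);
  // Assumption: cache writes and cache reads are billed at separate rates.
  const cacheCostUSD =
    costFor(usage.cacheCreationTokens, pricing.cacheWritePerMTok) +
    costFor(usage.cacheReadTokens, pricing.cacheReadPerMTok);
  return {
    inputCostUSD,
    outputCostUSD,
    cacheCostUSD,
    totalCostUSD: inputCostUSD + outputCostUSD + cacheCostUSD,
    model,
    currency: 'USD',
  };
}

// Example with made-up prices; real provider prices differ and change over time.
const exampleCost = estimateCost(
  { inputTokens: 1200, outputTokens: 300, cacheCreationTokens: 0, cacheReadTokens: 0, totalTokens: 1500 },
  { inputPerMTok: 3, outputPerMTok: 15, cacheWritePerMTok: 3.75, cacheReadPerMTok: 0.3 },
  'example-model',
);
console.log(exampleCost.totalCostUSD);
```

Keeping the pricing table as data rather than hard-coding it in the function makes it easier to update when a provider changes its rates.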
stream(params: QueryParams): AsyncGenerator<StreamChunk>
Stream text generation for real-time responses.
Parameters: Same as query()
Yields: StreamChunk
- Text chunks: { type: 'text-delta', textDelta: string }
- Finish chunk: { type: 'finish', finishReason: string, usage: TokenUsage, cost: CostBreakdown }
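The two chunk shapes above form a discriminated union, so a consumer can switch on type and let TypeScript narrow the fields. The sketch below shows one way to collect a stream into a single string. It does not call the real client: fakeStream is a hypothetical stand-in for client.stream(), and the usage and cost fields are simplified.

```typescript
// Simplified chunk shapes based on the documented StreamChunk variants.
type StreamChunk =
  | { type: 'text-delta'; textDelta: string }
  | {
      type: 'finish';
      finishReason: string;
      usage: { inputTokens: number; outputTokens: number };
      cost: { totalCostUSD: number };
    };

// Hypothetical stand-in for client.stream(); yields canned chunks.
async function* fakeStream(): AsyncGenerator<StreamChunk> {
  yield { type: 'text-delta', textDelta: 'Hello, ' };
  yield { type: 'text-delta', textDelta: 'world' };
  yield {
    type: 'finish',
    finishReason: 'stop',
    usage: { inputTokens: 5, outputTokens: 2 },
    cost: { totalCostUSD: 0 },
  };
}

// Accumulates text deltas and records the finish reason, if one arrives.
async function collectStream(
  stream: AsyncIterable<StreamChunk>,
): Promise<{ text: string; finishReason?: string }> {
  let text = '';
  let finishReason: string | undefined;
  for await (const chunk of stream) {
    if (chunk.type === 'text-delta') {
      text += chunk.textDelta;
    } else {
      finishReason = chunk.finishReason;
    }
  }
  return { text, finishReason };
}

collectStream(fakeStream()).then((result) => {
  console.log(result.text, result.finishReason);
});
```

Because collectStream accepts any AsyncIterable<StreamChunk>, the same function should work with the real stream as long as its chunks match these shapes.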
getHistory(): Array<{ role: string; content: string }>
Get the current conversation history.
clearHistory(): void
Clear the conversation history.
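How getHistory() and clearHistory() might behave can be sketched with a small in-memory store. This is an illustration under assumptions, not the client's internals: the ConversationHistory class, its add method, and the maxMessages cap are hypothetical.

```typescript
type Message = { role: string; content: string };

// Hypothetical in-memory history with an optional cap on stored messages.
class ConversationHistory {
  private messages: Message[] = [];
  private readonly maxMessages: number;

  constructor(maxMessages = 20) {
    this.maxMessages = maxMessages;
  }

  // Appends a message and drops the oldest ones once the cap is exceeded.
  add(role: string, content: string): void {
    this.messages.push({ role, content });
    const overflow = this.messages.length - this.maxMessages;
    if (overflow > 0) {
      this.messages.splice(0, overflow);
    }
  }

  // Returns copies so callers cannot mutate the stored history.
  getHistory(): Message[] {
    return this.messages.map((m) => ({ ...m }));
  }

  clearHistory(): void {
    this.messages = [];
  }
}

const history = new ConversationHistory(4);
history.add('user', 'Hello!');
history.add('assistant', 'Hello! How can I help you today?');
console.log(history.getHistory().length);
history.clearHistory();
```

Capping the history is one way to keep prompts within a model's context limit; whether the real client trims history at all is not stated here.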
Types
LLMClientConfig
interface LLMClientConfig {
provider: 'anthropic' | 'openai' | 'lmstudio';
model: string;
apiKey?: string;
baseUrl?: string;
environment?: 'node' | 'browser';
proxyUrl?: string;
temperature?: number;
maxTokens?: number;
retry?: RetryConfig;
logging?: LoggingConfig;
budget?: BudgetConfig;
onRequest?: (request: LLMRequest) => void;
onResponse?: (response: LLMResponse) => void;
onError?: (error: Error) => void;
onStreamStart?: (event: StreamStartEvent) => void;
onStreamEnd?: (event: StreamEndEvent) => void;
ams?: AMSConfig;
knowledgeStore?: KnowledgeStoreConfig;
}
LLMResult
interface LLMResult {
content: string;
usage: TokenUsage;
cost: CostBreakdown;
toolCalls?: ToolCall[];
metadata: ResultMetadata;
}
TokenUsage
interface TokenUsage {
inputTokens: number;
outputTokens: number;
cacheCreationTokens: number;
cacheReadTokens: number;
totalTokens: number;
}
CostBreakdown
interface CostBreakdown {
inputCostUSD: number;
outputCostUSD: number;
cacheCostUSD: number;
totalCostUSD: number;
model: string;
currency: 'USD';
}
Error Handling
The client throws LLMError instances with detailed error information:
import { LLMError, LLMErrorCode } from '@epic-flow/llm-client';
try {
const result = await client.query({ prompt: 'Test' });
} catch (error) {
if (error instanceof LLMError) {
console.error('Error code:', error.code);
console.error('Provider:', error.provider);
console.error('Retryable:', error.retryable);
console.error('Original error:', error.originalError);
}
}
Error Codes:
- RATE_LIMIT - Rate limit exceeded
- INVALID_API_KEY - API key is invalid
- MODEL_NOT_FOUND - Model doesn't exist
- CONTEXT_LENGTH_EXCEEDED - Input too long
- NETWORK_ERROR - Network connection failed
- TIMEOUT - Request timed out
- TOOL_EXECUTION_ERROR - Tool execution failed
- BUDGET_EXCEEDED - Budget limit reached
- UNKNOWN - Unclassified error
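The error codes and the retry settings shown earlier (maxAttempts, initialDelayMs, maxDelayMs, backoffMultiplier) can be combined in a rough sketch of retry with exponential backoff. Everything below is an assumption made for illustration: which codes count as retryable, that maxAttempts includes the first call, and the helper names getErrorCode, isRetryable, backoffDelayMs, and withRetry. The package's own logic may differ.

```typescript
// Assumption: only these codes represent transient failures worth retrying.
const RETRYABLE_CODES = new Set<string>(['RATE_LIMIT', 'NETWORK_ERROR', 'TIMEOUT']);

// Reads a string `code` property from an unknown thrown value, if present.
function getErrorCode(err: unknown): string | undefined {
  if (typeof err === 'object' && err !== null && 'code' in err) {
    const code = (err as { code: unknown }).code;
    return typeof code === 'string' ? code : undefined;
  }
  return undefined;
}

function isRetryable(code: string | undefined): boolean {
  return code !== undefined && RETRYABLE_CODES.has(code);
}

// Delay before the next try: initial * multiplier^(attempt - 1), capped at max.
function backoffDelayMs(
  attempt: number,
  initialDelayMs = 1000,
  maxDelayMs = 10000,
  backoffMultiplier = 2,
): number {
  const delay = initialDelayMs * Math.pow(backoffMultiplier, attempt - 1);
  return Math.min(delay, maxDelayMs);
}

// Runs fn up to maxAttempts times in total, sleeping between retryable failures.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRetryable(getErrorCode(err)) || attempt >= maxAttempts) {
        throw err;
      }
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Passing sleep as a parameter keeps the sketch testable without real delays. Adding random jitter to the delay is a common refinement when many clients retry at once.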
Environment Detection
The client automatically detects whether it's running in Node.js or a browser environment and adjusts its behavior accordingly:
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
// environment is auto-detected, but can be overridden
environment: 'node', // or 'browser'
});
console.log(client.environment); // 'node' or 'browser'
Cost Tracking
The client automatically tracks costs for all providers based on current pricing:
const result = await client.query({ prompt: 'Test' });
console.log('Cost Breakdown:');
console.log(' Input: $', result.cost.inputCostUSD);
console.log(' Output: $', result.cost.outputCostUSD);
console.log(' Cache: $', result.cost.cacheCostUSD);
console.log(' Total: $', result.cost.totalCostUSD);
console.log(' Model: ', result.cost.model);
Best Practices
1. Use Environment Variables for API Keys
// Good
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
});
// Bad - hardcoded API key
const insecureClient = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: 'sk-ant-...',
});
2. Configure Retry Logic for Production
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
retry: {
maxAttempts: 3,
initialDelayMs: 1000,
maxDelayMs: 10000,
},
});
3. Set Budget Limits
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
budget: {
maxCostPerRequest: 0.50,
dailyBudget: 100.00,
},
});
4. Use Event Hooks for Monitoring
const client = new LLMClient({
provider: 'anthropic',
model: 'claude-sonnet-4',
apiKey: process.env.ANTHROPIC_API_KEY,
onResponse: (response) => {
// Log to your monitoring system
metrics.recordLatency(response.latency);
metrics.recordCost(response.cost);
},
});
5. Clear History for New Conversations
// Starting a new conversation topic
client.clearHistory();
await client.query({ prompt: 'New topic...' });
Contributing
Contributions are welcome! Please see the Epic Flow contribution guidelines.
Development
# Install dependencies
yarn install
# Run tests
yarn test
# Run tests in watch mode
yarn test:watch
# Build the package
yarn build
# Type checking
yarn typecheck
# Linting
yarn lint
License
MIT
Support
For issues, questions, or contributions, please refer to the Epic Flow project documentation.
