@ddse/acm-llm
OpenAI-compatible LLM client with streaming support for Ollama and vLLM.
Overview
The LLM package provides a unified client interface for local LLM providers using the OpenAI-compatible API format. It supports both standard request/response and streaming modes.
Installation
```bash
pnpm add @ddse/acm-llm @ddse/acm-sdk
```
Features
- ✅ OpenAI-compatible API interface
- ✅ Streaming support
- ✅ Works with Ollama and vLLM out of the box
- ✅ Configurable base URLs
- ✅ Optional API key support
- ✅ Zero external dependencies
Usage
Basic Usage with Ollama
```ts
import { createOllamaClient } from '@ddse/acm-llm';

const client = createOllamaClient('llama3.1');

const response = await client.generate([
  { role: 'user', content: 'What is 2+2?' }
], {
  temperature: 0.7,
  maxTokens: 100,
});

console.log(response.text);
```
Streaming
```ts
import { createOllamaClient } from '@ddse/acm-llm';

const client = createOllamaClient('llama3.1');

for await (const chunk of client.generateStream([
  { role: 'user', content: 'Count to 10' }
])) {
  if (!chunk.done) {
    process.stdout.write(chunk.delta);
  }
}
```
Using vLLM
```ts
import { createVLLMClient } from '@ddse/acm-llm';

const client = createVLLMClient('qwen2.5:7b');

const response = await client.generate([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Hello!' }
]);
```
Custom Configuration
```ts
import { OpenAICompatClient } from '@ddse/acm-llm';

const client = new OpenAICompatClient({
  baseUrl: 'http://localhost:8000/v1',
  apiKey: 'optional-key',
  model: 'my-model',
  name: 'my-provider',
});
```
With ACM Planner
```ts
import { createOllamaClient } from '@ddse/acm-llm';
import { StructuredLLMPlanner } from '@ddse/acm-planner';

const llm = createOllamaClient('llama3.1');
const planner = new StructuredLLMPlanner();

const { plans } = await planner.plan({
  goal: { id: 'g1', intent: 'Process order' },
  context: { id: 'ctx1', facts: { orderId: 'O123' } },
  capabilities: [{ name: 'search' }, { name: 'process' }],
  llm,
});
```
API Reference
OpenAICompatClient
Constructor:
```ts
new OpenAICompatClient({
  baseUrl: string;
  apiKey?: string;
  model: string;
  name: string;
})
```
Methods:
name(): string
Returns the provider name.
generate(messages, opts?): Promise<LLMResponse>
Generate a completion.
Parameters:
- messages: ChatMessage[] - Array of chat messages
- opts?: { temperature?, seed?, maxTokens? } - Optional generation options
Returns: Promise<LLMResponse>
```ts
{
  text: string;
  tokens?: number;
  raw?: any;
}
```
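A minimal sketch of consuming the result, based on the LLMResponse shape above (the prompt is only illustrative):

```ts
const response = await client.generate([
  { role: 'user', content: 'Summarize the order status.' },
]);

console.log(response.text); // completion text
if (response.tokens !== undefined) {
  console.log(`completion used ${response.tokens} tokens`); // reported only when the provider returns usage
}
```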
generateStream(messages, opts?): AsyncIterableIterator<LLMStreamChunk>
Generate a streaming completion.
Parameters:
- Same as generate()
Yields: LLMStreamChunk
```ts
{
  delta: string;
  done: boolean;
}
```
Helper Functions
createOllamaClient(model, baseUrl?)
Create a client for Ollama.
Defaults:
- baseUrl: http://localhost:11434/v1
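If Ollama is listening somewhere other than the default, pass a base URL as the second argument. The host below is a hypothetical placeholder, not a package default:

```ts
import { createOllamaClient } from '@ddse/acm-llm';

// Hypothetical remote Ollama endpoint; substitute your own host and port.
const client = createOllamaClient('llama3.1', 'http://gpu-box.local:11434/v1');
```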
createVLLMClient(model, baseUrl?)
Create a client for vLLM.
Defaults:
- baseUrl: http://localhost:8000/v1
Types
ChatMessage
```ts
type ChatMessage = {
  role: 'system' | 'user' | 'assistant';
  content: string;
};
```
LLMResponse
```ts
type LLMResponse = {
  text: string;
  tokens?: number;
  raw?: any;
};
```
LLMStreamChunk
```ts
type LLMStreamChunk = {
  delta: string;
  done: boolean;
};
```
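As a quick illustration, a conversation can be built as a ChatMessage[] and passed directly to generate() or generateStream(). This sketch assumes the types are exported from the package entry point:

```ts
// Assumption: ChatMessage is exported alongside the client helpers.
import { createOllamaClient, type ChatMessage } from '@ddse/acm-llm';

const messages: ChatMessage[] = [
  { role: 'system', content: 'You are a terse assistant.' },
  { role: 'user', content: 'Explain ACM plans in one sentence.' },
];

const client = createOllamaClient('llama3.1');
const { text } = await client.generate(messages);
console.log(text);
```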
Provider Setup
Ollama
- Install Ollama from https://ollama.ai
- Pull a model: `ollama pull llama3.1`
- Start server: `ollama serve`
- Default endpoint: http://localhost:11434/v1
vLLM
- Install vLLM: `pip install vllm`
- Start server: `vllm serve <model-name> --port 8000`
- Default endpoint: http://localhost:8000/v1
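Once either server is running, a one-off generate() call is a quick way to confirm the endpoint and model name are wired up correctly. A minimal sketch using the Ollama defaults:

```ts
import { createOllamaClient } from '@ddse/acm-llm';

// Use createVLLMClient('<model-name>') instead if you started vLLM.
const client = createOllamaClient('llama3.1');

const { text } = await client.generate([{ role: 'user', content: 'ping' }]);
console.log('Provider responded:', text);
```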
Error Handling
The client throws errors for:
- Network failures
- Invalid responses
- Non-2xx status codes
```ts
try {
  const response = await client.generate([...]);
} catch (error) {
  console.error('LLM error:', error.message);
}
```
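Because transient network failures surface as thrown errors, long-running agents may want a small retry wrapper around generate(). This is one possible pattern, not something the package ships:

```ts
import type { OpenAICompatClient } from '@ddse/acm-llm';

// Hypothetical helper: retry generate() with exponential backoff on failure.
async function generateWithRetry(
  client: OpenAICompatClient,
  messages: { role: 'system' | 'user' | 'assistant'; content: string }[],
  attempts = 3,
) {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await client.generate(messages);
    } catch (error) {
      lastError = error;
      // Back off 0.5s, 1s, 2s, ... between attempts.
      await new Promise((resolve) => setTimeout(resolve, 2 ** i * 500));
    }
  }
  throw lastError;
}
```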
Performance Tips
- Use streaming for long responses to improve UX
- Set appropriate `maxTokens` limits
- Use `temperature: 0` for deterministic outputs
- Set `seed` for reproducible generation (see the sketch below)
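For example, combining the documented options for a capped, reproducible run (the prompt and values are illustrative):

```ts
const response = await client.generate(
  [{ role: 'user', content: 'List three test cases for the checkout flow.' }],
  {
    temperature: 0, // deterministic sampling
    seed: 42,       // reproducible output across runs, where the backend supports it
    maxTokens: 256, // hard cap on completion length
  },
);
```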
License
Apache-2.0
