Pioneer Inference
OpenAI-compatible inference proxy for the Pioneer platform. This module gives your apps access to AI inference while keeping provider API keys on the server.
Features
- API Key Protection: Keep your AI provider API keys secure on the server
- System Prompt Injection: Automatically inject system prompts to guide model behavior
- Multi-Provider Support: Works with OpenAI, OpenRouter, and Venice.ai
- OpenAI-Compatible API: Drop-in replacement for OpenAI client libraries
Installation
This is a workspace package in the Pioneer monorepo. It's automatically available to other packages via:
import { InferenceService, createInferenceServiceFromEnv } from '@pioneer-platform/pioneer-inference';
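For example, a service can be built from the environment variables described under Configuration. This is a minimal sketch; it assumes createInferenceServiceFromEnv takes no arguments and reads the INFERENCE_* variables listed below:
// Sketch only: createInferenceServiceFromEnv is assumed to take no arguments and to
// build an InferenceService from the INFERENCE_* environment variables documented below.
import { createInferenceServiceFromEnv } from '@pioneer-platform/pioneer-inference';
const service = createInferenceServiceFromEnv();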
Configuration
Set the following environment variables:
# Provider selection (openai, openrouter, or venice)
INFERENCE_PROVIDER=openai
# API key for the selected provider
INFERENCE_API_KEY=your-api-key-here
# Or use the standard OpenAI key
OPENAI_API_KEY=your-api-key-here
# Optional: Custom base URL (for self-hosted or proxy endpoints)
INFERENCE_BASE_URL=https://api.openai.com/v1
# Optional: System prompt to inject in all requests
INFERENCE_SYSTEM_PROMPT="You are a helpful cryptocurrency wallet assistant."
# Optional: Default model to use
INFERENCE_DEFAULT_MODEL=gpt-4-turbo-preview
Provider Configuration
OpenAI
INFERENCE_PROVIDER=openai
INFERENCE_API_KEY=sk-...
# Default model: gpt-4-turbo-preview
OpenRouter
INFERENCE_PROVIDER=openrouter
INFERENCE_API_KEY=sk-or-...
INFERENCE_BASE_URL=https://openrouter.ai/api/v1
# Default model: anthropic/claude-3-opus
Venice.ai
INFERENCE_PROVIDER=venice
INFERENCE_API_KEY=your-venice-key
INFERENCE_BASE_URL=https://api.venice.ai/api/v1
# Default model: llama-3.1-405b
Usage
REST API Endpoints
The Pioneer server exposes OpenAI-compatible endpoints at /v1:
Create Chat Completion
POST http://localhost:9001/v1/chat/completions
Content-Type: application/json
{
"model": "gpt-4-turbo-preview",
"messages": [
{
"role": "user",
"content": "What is Bitcoin?"
}
],
"temperature": 0.7,
"max_tokens": 150
}
List Available Models
GET http://localhost:9001/v1/models
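For example, the same endpoint can be queried with the official OpenAI client. This is a sketch; it assumes the proxy returns OpenAI's standard list shape ({ object: 'list', data: [...] }):
// Sketch: listing models through the proxy with the OpenAI client library.
import OpenAI from 'openai';
const client = new OpenAI({ apiKey: 'not-needed', baseURL: 'http://localhost:9001/v1' });
const models = await client.models.list();
console.log(models.data.map((m) => m.id));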
Get Provider Info
GET http://localhost:9001/v1/provider
Response:
{
"provider": "openai",
"hasSystemPrompt": true,
"configured": true
}
Using in TypeScript
import { InferenceService } from '@pioneer-platform/pioneer-inference';
// Create service with custom configuration
const service = new InferenceService({
provider: 'openai',
apiKey: 'sk-...',
systemPrompt: 'You are a crypto assistant.',
defaultModel: 'gpt-4-turbo-preview'
});
// Create chat completion
const response = await service.createChatCompletion({
model: 'gpt-4-turbo-preview',
messages: [
{ role: 'user', content: 'Explain blockchain' }
],
temperature: 0.7,
max_tokens: 150
});
console.log(response.choices[0].message.content);
Using from Browser/Frontend
Your frontend apps can call the Pioneer server endpoints without exposing API keys:
// Using OpenAI client library with Pioneer server as base URL
import OpenAI from 'openai';
const client = new OpenAI({
apiKey: 'not-needed', // Server handles authentication
baseURL: 'http://localhost:9001/v1',
dangerouslyAllowBrowser: true // Only because we're proxying
});
const completion = await client.chat.completions.create({
model: 'gpt-4-turbo-preview',
messages: [
{ role: 'user', content: 'What is Ethereum?' }
]
});
console.log(completion.choices[0].message.content);
Or using fetch directly:
const response = await fetch('http://localhost:9001/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
model: 'gpt-4-turbo-preview',
messages: [
{ role: 'user', content: 'What is Ethereum?' }
]
})
});
const data = await response.json();
console.log(data.choices[0].message.content);
System Prompt Injection
The service automatically injects a system prompt if:
- A system prompt is configured via INFERENCE_SYSTEM_PROMPT or in the config
- The messages array doesn't already contain a system message
This ensures consistent model behavior across all requests without requiring clients to specify the system prompt.
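Conceptually, the rule looks like the following TypeScript sketch (illustrative only, not the module's actual implementation):
// Illustrative sketch of the injection rule described above.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };
function withSystemPrompt(messages: ChatMessage[], systemPrompt?: string): ChatMessage[] {
  // Only inject when a prompt is configured and the client did not send one already.
  if (!systemPrompt || messages.some((m) => m.role === 'system')) {
    return messages;
  }
  return [{ role: 'system', content: systemPrompt }, ...messages];
}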
Security Considerations
- Never expose your API keys to the frontend - always use the server proxy
- The Pioneer server should be configured with appropriate CORS settings
- Consider adding authentication to the inference endpoints for production use (see the sketch after this list)
- Rate limiting should be implemented to prevent API abuse
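As one way to address the authentication point above, a bearer-token check could sit in front of the /v1 routes. This is a sketch only; it assumes an Express-style server and an INFERENCE_PROXY_TOKEN variable, neither of which is part of this module:
// Hypothetical Express-style middleware guarding the inference routes.
// INFERENCE_PROXY_TOKEN is an assumed variable name, not part of this module's configuration.
import express from 'express';
const app = express();
app.use('/v1', (req, res, next) => {
  const token = req.headers.authorization?.replace(/^Bearer /, '');
  if (token !== process.env.INFERENCE_PROXY_TOKEN) {
    res.status(401).json({ error: 'unauthorized' });
    return;
  }
  next();
});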
API Compatibility
This module implements the OpenAI Chat Completions API specification, making it compatible with:
- OpenAI's official client libraries
- Any tool or library that supports OpenAI-compatible APIs
- LangChain, LlamaIndex, and other AI frameworks (a hedged LangChain sketch follows this list)
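For example, LangChain's OpenAI chat model can be pointed at the proxy. This is a hedged sketch; it assumes @langchain/openai is installed, and option names may vary between versions:
// Sketch: routing a LangChain chat model through the Pioneer proxy.
import { ChatOpenAI } from '@langchain/openai';
const model = new ChatOpenAI({
  model: 'gpt-4-turbo-preview',
  apiKey: 'not-needed', // the proxy holds the real provider key
  configuration: { baseURL: 'http://localhost:9001/v1' },
});
const result = await model.invoke('What is Bitcoin?');
console.log(result.content);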
Development
Build the module:
cd modules/pioneer/pioneer-inference
bun run build
Watch for changes:
bun run build:watch
License
MIT
