NexGen SDK
A Next-Generation Node.js SDK for interacting with AI models, compatible with OpenAI's API format (VLLM provider)
Features
- VLLM Provider: Full support for VLLM deployments
- Chat Completions: With system prompts, streaming, and tools support
- Text Completions: Standard text generation
- Embeddings: Text embedding generation
- Models: List and retrieve model information
- Streaming: Real-time response streaming with Server-Sent Events (SSE)
- Error Handling: Comprehensive error handling with specific exception types
- Retry Logic: Automatic retry with exponential backoff (see the sketch after this list)
- Validation: Plain JavaScript with parameter validation and clear error messages
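As a quick illustration of the retry settings, here is a minimal sketch; the exact backoff schedule is internal to the SDK, so the doubling shown in the comment is an assumption:

const { AISDKClient } = require('nexgen-sdk');

// With maxRetries: 3 and retryDelay: 1.0, an exponential backoff would wait
// roughly 1s, 2s, then 4s between attempts (assumed schedule; only maxRetries
// and retryDelay are documented under Configuration Options below).
const client = new AISDKClient({
  provider: 'vllm',
  apiKey: 'your-api-key',
  baseUrl: 'https://your-vllm-endpoint.com',
  defaultModel: 'your-model-name',
  maxRetries: 3,
  retryDelay: 1.0
});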
Installation
npm install nexgen-sdk
Quick Start
Basic Usage
const { AISDKClient } = require('nexgen-sdk');

// Initialize VLLM client
const client = new AISDKClient({
  provider: 'vllm',
  apiKey: 'your-api-key',
  baseUrl: 'https://your-vllm-endpoint.com',
  defaultModel: 'your-model-name'
});

async function main() {
  // Create chat completion
  const response = await client.chat().create(
    'You are a helpful assistant.', // system prompt
    [ // messages
      { "role": "user", "content": { "type": "text", "text": "Hello!" } }
    ],
    {
      model: 'model_name',
      temperature: 0.7
    }
  );
  console.log(response.choices[0].message.content);
  client.close();
}

main();
Factory Functions
const { createVLLMClient } = require('nexgen-sdk');
const client = createVLLMClient({
  apiKey: 'your-api-key',
  baseUrl: 'https://your-vllm-endpoint.com',
  defaultModel: 'your-model-name'
});
Builder Pattern
const { ClientBuilder } = require('nexgen-sdk');
const client = new ClientBuilder()
  .withProvider('vllm')
  .withApiKey('your-api-key')
  .withBaseUrl('https://your-vllm-endpoint.com')
  .withDefaultModel('your-model-name')
  .withTimeout(60.0)
  .build();
API Reference
Chat Completions
// Regular chat completion
const response = await client.chat().create(
  systemPrompt, // string | null - system prompt
  messages,     // array - message objects
  options       // object - additional options
);
// Streaming chat completion
const stream = await client.chat().create(
  systemPrompt,
  messages,
  { stream: true }
);
for await (const chunk of stream) {
  // Deltas may be empty (e.g., the final chunk), so guard before printing
  const content = chunk.choices[0]?.delta?.content;
  if (content) console.log(content);
}
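Chat completions also accept a tools option. The README does not pin down the schema, so here is a minimal sketch assuming the SDK forwards an OpenAI-style tools array to the endpoint (the tool name get_weather and the passthrough behavior are assumptions):

const response = await client.chat().create(
  'You are a helpful assistant.',
  [
    { "role": "user", "content": { "type": "text", "text": "What is the weather in Paris?" } }
  ],
  {
    tools: [
      {
        type: 'function',
        function: {
          name: 'get_weather', // hypothetical tool
          description: 'Get the current weather for a city',
          parameters: {
            type: 'object',
            properties: { city: { type: 'string' } },
            required: ['city']
          }
        }
      }
    ]
  }
);
// If the model chose to call a tool, the call typically appears on
// response.choices[0].message.tool_calls (OpenAI-compatible shape).
Message Format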
const messages = [
  {
    "role": "system",
    "content": "You are a helpful assistant."
  },
  {
    "role": "user",
    "content": {
      "type": "text",
      "text": "Hello! How can you help me?"
    }
  },
  {
    "role": "assistant",
    "content": "I'm here to help you with anything you need!"
  }
];

Note that in these examples user messages carry a structured content object with a type field, while system and assistant messages use plain strings.
Text Completions
const response = await client.completions().create({
  model: 'model_name',
  prompt: 'Once upon a time...',
  temperature: 0.7,
  max_tokens: 100
});
console.log(response.choices[0].text);
Embeddings
const response = await client.embeddings().create({
  model: 'model_name',
  input: 'Hello, world!'
});
console.log(response.data[0].embedding);
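OpenAI-compatible embeddings endpoints generally accept an array of inputs as well. Assuming this SDK passes the input field through unchanged (an assumption, not stated in this README), batching would look like:

const batch = await client.embeddings().create({
  model: 'model_name',
  input: ['Hello, world!', 'Goodbye, world!'] // one embedding per input string
});
console.log(batch.data.length); // 2 embeddings, in input order
Models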
// List models
const models = await client.models().list();
console.log(models.data);
// Get model info
const modelInfo = await client.models().retrieve('model_name');
console.log(modelInfo);
Configuration Options
const client = new AISDKClient({
  provider: 'vllm',                          // 'vllm' (currently supported)
  apiKey: 'your-api-key',                    // Your API key/token
  baseUrl: 'https://your-vllm-endpoint.com', // Your VLLM endpoint
  timeout: 30.0,                             // Request timeout in seconds
  maxRetries: 3,                             // Maximum retry attempts
  retryDelay: 1.0,                           // Delay between retries in seconds
  defaultModel: 'your-model-name'            // Default model to use
});

A model can also be chosen per request through the model field of each method's options object (see Chat Completions above).
Environment Variables
For better security and flexibility, use environment variables:
export AI_SDK_API_KEY="your-api-key"
export AI_SDK_BASE_URL="https://your-vllm-endpoint.com"
export AI_SDK_DEFAULT_MODEL="your-model-name"
export AI_SDK_TIMEOUT="30.0"

const { Config } = require('nexgen-sdk');

const config = Config.fromEnvironment();
const client = new AISDKClient(config);
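If you keep these variables in a .env file, the widely used dotenv package (a separate dependency, not part of this SDK) can load them before constructing the client:

// npm install dotenv
require('dotenv').config(); // loads .env into process.env

const { Config, AISDKClient } = require('nexgen-sdk');
const client = new AISDKClient(Config.fromEnvironment());
Error Handling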
The SDK provides specific exception types:
const {
  AISDKException,
  APIException,
  BadRequestException,
  AuthenticationException,
  RateLimitException,
  ServerException,
  NetworkException,
  TimeoutException
} = require('nexgen-sdk');
try {
  const response = await client.chat().create(systemPrompt, messages, options);
} catch (error) {
  if (error instanceof AuthenticationException) {
    console.error('Authentication failed:', error.message);
  } else if (error instanceof RateLimitException) {
    console.error('Rate limit exceeded:', error.message);
  } else if (error instanceof NetworkException) {
    console.error('Network error:', error.message);
  } else {
    console.error('Unexpected error:', error.message);
  }
}
Examples
Run the included examples:
# Run all examples
npm start
# Run specific example
node examples/basic-usage.js
The examples demonstrate:
- Basic chat completions
- Streaming responses
- Text completions
- Embeddings
- Model management
- Builder pattern
- Factory functions
VLLM Provider Configuration
This SDK supports any VLLM deployment with standard OpenAI-compatible endpoints:
- Chat Endpoint: /v1/chat/completions
- Text Endpoint: /v1/completions
- Embeddings Endpoint: /v1/embeddings
- Models Endpoint: /v1/models
Users should configure their specific VLLM endpoint and model names.
Example Configurations
For Public VLLM Services
const client = new AISDKClient({
  provider: 'vllm',
  apiKey: 'your-api-key',
  baseUrl: 'https://api.some-vllm-provider.com',
  defaultModel: 'some-model-name'
});
For Self-Hosted VLLM
const client = new AISDKClient({
  provider: 'vllm',
  apiKey: 'your-token',
  baseUrl: 'http://localhost:8000',
  defaultModel: 'local-model'
});
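For reference, an endpoint like the one above is typically provided by vLLM's own OpenAI-compatible server (this command belongs to vLLM, not to this SDK; substitute your model name):

python -m vllm.entrypoints.openai.api_server --model local-model --port 8000
Using Environment Variables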
export AI_SDK_API_KEY="your-api-key"
export AI_SDK_BASE_URL="https://your-vllm-endpoint.com"
export AI_SDK_DEFAULT_MODEL="your-model-name"
API Compatibility
This SDK maintains API compatibility with the Python version:
- Same method signatures
- Same parameter validation
- Same response format
- Same error handling approach
Requirements
- Node.js >= 14.0.0
- axios >= 1.6.0
License
MIT
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
Support
For issues and questions:
- Open an issue on GitHub
- Check the examples for common usage patterns
- Review the API documentation above
