@polargrid/polargrid-sdk v0.3.0
PolarGrid SDK
The official JavaScript/TypeScript SDK for PolarGrid Edge AI Infrastructure with full API support and mock data capabilities.
Features
- ✅ Text Inference: Completions and chat completions (streaming support)
- ✅ Voice: Text-to-speech and speech-to-text
- ✅ Model Management: Load, unload, and check model status
- ✅ GPU Management: Monitor and manage GPU resources
- ✅ Mock Data Mode: Develop without a backend (perfect for frontend work)
- ✅ Full TypeScript Support: Complete type definitions
- ✅ Error Handling: Comprehensive error types
- ✅ Retry Logic: Automatic retry with exponential backoff (see the sketch after this list)
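The SDK applies this retry policy internally, but if you want the same behavior around your own calls, here is a minimal sketch of capped exponential backoff. `withBackoff` and its delay constants are illustrative helpers, not SDK exports.

```typescript
// Illustrative only: capped exponential backoff around any async call.
// `maxRetries` and `baseMs` are example values, not SDK defaults.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      // Delay doubles each attempt: 500ms, 1s, 2s, ...
      const delay = baseMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```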
Installation
```bash
npm install @polargrid/polargrid-sdk
```

Quick Start
With Mock Data (Development)
Perfect for frontend development before the backend is ready:
```typescript
import { PolarGrid } from '@polargrid/polargrid-sdk';

const client = new PolarGrid({
  useMockData: true, // Enable mock mode
  debug: true,       // See what's happening
});

// All methods work with realistic mock data
const response = await client.chatCompletion({
  model: 'llama-3.1-8b',
  messages: [
    { role: 'user', content: 'Hello!' }
  ]
});

console.log(response.choices[0].message.content);
```

With Real API (Production)
```typescript
import { PolarGrid } from '@polargrid/polargrid-sdk';

const client = new PolarGrid({
  apiKey: 'pg_your_api_key',
  useMockData: false, // Use real API
});
```

API Reference
Text Inference
Chat Completions
```typescript
// Non-streaming
const response = await client.chatCompletion({
  model: 'llama-3.1-8b',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is quantum computing?' }
  ],
  maxTokens: 150,
  temperature: 0.7,
});

console.log(response.choices[0].message.content);
```

```typescript
// Streaming
for await (const chunk of client.chatCompletionStream({
  model: 'llama-3.1-8b',
  messages: [{ role: 'user', content: 'Tell me a story' }],
})) {
  if (chunk.choices[0].delta.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}
```
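If you need the full text after streaming (for example, to store the conversation), you can accumulate the deltas as they arrive. A minimal sketch, assuming the chunk shape shown above:

```typescript
// Collect streamed deltas into one string (sketch; assumes the
// chunk.choices[0].delta.content shape from the example above).
let fullText = '';
for await (const chunk of client.chatCompletionStream({
  model: 'llama-3.1-8b',
  messages: [{ role: 'user', content: 'Tell me a story' }],
})) {
  fullText += chunk.choices[0].delta.content ?? '';
}
console.log(fullText);
```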
Text Completions

```typescript
const response = await client.completion({
  prompt: 'Once upon a time',
  model: 'llama-3.1-8b',
  maxTokens: 100,
  temperature: 0.8,
});

console.log(response.choices[0].text);
```

Voice / Audio
Text-to-Speech
```typescript
// Generate audio
const audioBuffer = await client.textToSpeech({
  model: 'tts-1',
  input: 'Hello from PolarGrid!',
  voice: 'alloy',
  responseFormat: 'mp3',
  speed: 1.0,
});

// Save to file (Node.js)
import { writeFile } from 'fs/promises';
await writeFile('speech.mp3', Buffer.from(audioBuffer));
```

```typescript
// Streaming TTS
for await (const chunk of client.textToSpeechStream({
  model: 'tts-1',
  input: 'Long text to convert...',
  voice: 'nova',
})) {
  // Process audio chunks as they arrive
  audioStream.write(chunk);
}
```
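In the browser there is no file system to write to; one option is to decode and play the result with the Web Audio API. A sketch, assuming textToSpeech resolves to an ArrayBuffer (as the Buffer.from() call above suggests):

```typescript
// Browser playback sketch. Assumes `audioBuffer` is the ArrayBuffer
// returned by client.textToSpeech() in the example above.
const ctx = new AudioContext();
const decoded = await ctx.decodeAudioData(audioBuffer);
const source = ctx.createBufferSource();
source.buffer = decoded;
source.connect(ctx.destination);
source.start();
```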
Speech-to-Text

```typescript
// Transcribe audio
const file = new File([audioData], 'recording.mp3', { type: 'audio/mpeg' });

const transcription = await client.transcribe({
  file,
  model: 'whisper-1',
  language: 'en',
  responseFormat: 'json',
});

console.log(transcription.text);
```

```typescript
// Verbose transcription with timestamps
const verbose = await client.transcribe({
  file,
  model: 'whisper-1',
  responseFormat: 'verbose_json',
}) as VerboseTranscriptionResponse;

verbose.segments.forEach(segment => {
  console.log(`[${segment.start}s - ${segment.end}s]: ${segment.text}`);
});
```

```typescript
// Translate to English
const translation = await client.translate({
  file: spanishAudioFile,
  model: 'whisper-1',
  responseFormat: 'json',
});

console.log(translation.text);
```
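If you are capturing audio in the browser, MediaRecorder output can be wrapped in a File for transcribe. A minimal sketch using standard Web APIs; the 5-second cutoff is an arbitrary example:

```typescript
// Record ~5s of microphone audio and transcribe it (browser sketch).
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const recorder = new MediaRecorder(stream);
const chunks: Blob[] = [];
recorder.ondataavailable = (e) => chunks.push(e.data);

const recorded = new Promise<File>((resolve) => {
  recorder.onstop = () =>
    resolve(new File(chunks, 'recording.webm', { type: recorder.mimeType }));
});

recorder.start();
setTimeout(() => recorder.stop(), 5000); // stop after 5 seconds

const transcription = await client.transcribe({
  file: await recorded,
  model: 'whisper-1',
  responseFormat: 'json',
});
console.log(transcription.text);
```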
Model Management

```typescript
// List available models
const { data: models } = await client.listModels();
models.forEach(model => {
  console.log(`${model.id} (${model.ownedBy})`);
});
```

```typescript
// Load a model
const result = await client.loadModel({
  modelName: 'llama-3.1-70b',
  forceReload: false,
});
console.log(result.message);
```

```typescript
// Check model status
const status = await client.getModelStatus();
console.log('Loaded models:', status.loaded);
console.log('Loading status:', status.loadingStatus);
```

```typescript
// Unload a model
await client.unloadModel({ modelName: 'gpt2' });

// Unload all models
const result = await client.unloadAllModels();
console.log(`Unloaded ${result.totalUnloaded} models`);
```
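A common pattern is to make sure a model is resident before sending inference traffic. A sketch built from the calls above, assuming status.loaded is an array of model IDs as the example suggests:

```typescript
// Load a model only if it isn't already resident (sketch; assumes
// status.loaded is an array of model IDs, per the example above).
async function ensureModelLoaded(client: PolarGrid, modelName: string) {
  const status = await client.getModelStatus();
  if (!status.loaded.includes(modelName)) {
    const result = await client.loadModel({ modelName, forceReload: false });
    console.log(result.message);
  }
}

await ensureModelLoaded(client, 'llama-3.1-70b');
```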
GPU Management

```typescript
// Get detailed GPU status
const gpuStatus = await client.getGPUStatus();
gpuStatus.gpus.forEach(gpu => {
  console.log(`GPU ${gpu.index}: ${gpu.name}`);
  console.log(`  Memory: ${gpu.memory.usedGb}GB / ${gpu.memory.totalGb}GB`);
  console.log(`  Utilization: ${gpu.utilization.gpuPercent}%`);
  console.log(`  Temperature: ${gpu.temperatureC}°C`);
});
```

```typescript
// Get simplified memory info
const memory = await client.getGPUMemory();
console.log(`Memory used: ${memory.memory[0].usedGb}GB (${memory.memory[0].percentUsed}%)`);
```

```typescript
// Purge GPU memory
const purgeResult = await client.purgeGPU({ force: false });
console.log(`Freed ${purgeResult.memoryFreedGb}GB`);
console.log(`Unloaded models:`, purgeResult.modelsUnloaded);
console.log(purgeResult.recommendation);
```
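These calls compose into a simple memory guard, e.g. purging only when usage crosses a threshold. A sketch using the getGPUMemory shape shown above; the 90% cutoff is an arbitrary example:

```typescript
// Purge GPU memory only above a usage threshold (sketch; the 90%
// cutoff is an arbitrary example, not an SDK default).
async function purgeIfNearlyFull(client: PolarGrid, thresholdPct = 90) {
  const { memory } = await client.getGPUMemory();
  if (memory.some((m) => m.percentUsed > thresholdPct)) {
    const result = await client.purgeGPU({ force: false });
    console.log(`Freed ${result.memoryFreedGb}GB`);
  }
}
```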
Health Check

```typescript
const health = await client.health();
console.log(`Status: ${health.status}`);
console.log(`Backend healthy: ${health.backend.healthy}`);
console.log(`Models loaded: ${health.backend.info.modelsLoaded}`);
```
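For a dashboard or readiness probe you might poll this endpoint on an interval. A sketch assuming the response shape above; the 30-second interval is an arbitrary choice:

```typescript
// Poll the health endpoint every 30s and warn on degradation
// (sketch; interval and logging are illustrative choices).
setInterval(async () => {
  try {
    const health = await client.health();
    if (!health.backend.healthy) {
      console.warn(`Backend unhealthy, status: ${health.status}`);
    }
  } catch (err) {
    console.error('Health check failed:', err);
  }
}, 30_000);
```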
Error Handling

```typescript
import {
  PolarGrid,
  isPolarGridError,
  AuthenticationError,
  ValidationError,
  RateLimitError,
  ServerError,
  NetworkError,
} from '@polargrid/polargrid-sdk';

try {
  const response = await client.chatCompletion({
    model: 'llama-3.1-8b',
    messages: [{ role: 'user', content: 'Hello' }],
  });
} catch (error) {
  if (isPolarGridError(error)) {
    console.error(`PolarGrid Error: ${error.message}`);
    console.error(`Request ID: ${error.requestId}`);

    if (error instanceof AuthenticationError) {
      // Handle auth errors
    } else if (error instanceof ValidationError) {
      // Handle validation errors
      console.error('Details:', error.details);
    } else if (error instanceof RateLimitError) {
      // Handle rate limits
      console.error(`Retry after: ${error.retryAfter}s`);
    }
  }
}
```

Configuration Options
```typescript
const client = new PolarGrid({
  // API key (required for production, optional for mock mode)
  apiKey: 'pg_your_api_key',

  // Base URL (default: https://api.polargrid.ai)
  baseUrl: 'https://api.polargrid.ai',

  // JWT token exchange URL (default: /api/auth/inference-token)
  authUrl: '/api/auth/inference-token',

  // Request timeout in milliseconds (default: 30000)
  timeout: 30000,

  // Maximum retry attempts (default: 3)
  maxRetries: 3,

  // Enable debug logging (default: false)
  debug: true,

  // Use mock data instead of real API (default: false)
  useMockData: true,
});
```

Mock Data for Development
The SDK includes comprehensive mock data that matches the API spec exactly.
Why Use Mock Data?
- Frontend Development: Build UI components before the backend is ready
- Testing: Predictable responses for unit tests
- Demos: Show realistic flows without production infrastructure
- Development: Faster iteration without API calls
What's Mocked?
- ✅ All text inference endpoints with realistic responses
- ✅ Voice TTS and STT with proper audio formats
- ✅ Model management with state simulation
- ✅ GPU metrics with realistic utilization data
- ✅ Streaming responses (both text and audio)
- ✅ Error scenarios (configurable)
Mock Data Examples
```typescript
// Mock mode returns realistic data instantly
const client = new PolarGrid({ useMockData: true });

// Chat completions understand context
const response = await client.chatCompletion({
  model: 'llama-3.1-8b',
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ]
});
// Returns: "The capital of France is Paris..."

// Streaming works chunk by chunk
for await (const chunk of client.chatCompletionStream({
  model: 'llama-3.1-8b',
  messages: [{ role: 'user', content: 'Hello' }]
})) {
  // Each chunk arrives with realistic timing
}

// GPU status returns current metrics
const gpu = await client.getGPUStatus();
console.log(gpu.gpus[0].memory.usedGb); // e.g., 45.2 GB
```

Environment Variables
```bash
# API Key
POLARGRID_API_KEY=pg_your_api_key

# Base URL (optional)
NEXT_PUBLIC_INFERENCE_URL=https://api.polargrid.ai
```
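One way to wire these into the client (a sketch; the fallback URL mirrors the documented default):

```typescript
// Construct the client from environment variables (sketch).
const client = new PolarGrid({
  apiKey: process.env.POLARGRID_API_KEY,
  baseUrl: process.env.NEXT_PUBLIC_INFERENCE_URL ?? 'https://api.polargrid.ai',
});
```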
TypeScript Support

Full TypeScript support with comprehensive type definitions:

```typescript
import type {
  ChatCompletionRequest,
  ChatCompletionResponse,
  ModelInfo,
  GPUStatusResponse,
} from '@polargrid/polargrid-sdk';
```
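These types let you factor request construction into small typed helpers. A sketch using the imported types and the request fields shown earlier; `ask` is a local helper, not an SDK export:

```typescript
// Typed wrapper around chat completions (sketch; `ask` is a local
// helper, not an SDK export).
async function ask(client: PolarGrid, question: string): Promise<string> {
  const request: ChatCompletionRequest = {
    model: 'llama-3.1-8b',
    messages: [{ role: 'user', content: question }],
    maxTokens: 150,
  };
  const response: ChatCompletionResponse = await client.chatCompletion(request);
  return response.choices[0].message.content;
}
```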
Best Practices

1. Use Mock Data During Development

```typescript
const isDevelopment = process.env.NODE_ENV === 'development';

const client = new PolarGrid({
  apiKey: process.env.POLARGRID_API_KEY,
  useMockData: isDevelopment,
  debug: isDevelopment,
});
```

2. Handle Errors Gracefully
```typescript
// Small helper so the retry can wait out the rate-limit window
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

try {
  const response = await client.chatCompletion(request);
  return response;
} catch (error) {
  if (error instanceof RateLimitError) {
    // Wait for the server-suggested interval, then retry once
    await sleep(error.retryAfter * 1000);
    return client.chatCompletion(request);
  }
  throw error;
}
```

3. Use Streaming for Long Responses
```typescript
// Better user experience for long-form content
for await (const chunk of client.chatCompletionStream(request)) {
  updateUI(chunk.choices[0].delta.content);
}
```

Examples
See the /examples directory for complete working examples:
- examples/basic-chat.ts - Simple chat completion
- examples/streaming.ts - Streaming responses
- examples/voice.ts - TTS and STT examples
- examples/model-management.ts - Loading and managing models
- examples/gpu-monitoring.ts - GPU metrics dashboard
Development
```bash
# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

# Run tests in watch mode
npm run test:watch

# Type checking
npm run typecheck
```

License
MIT
Support
- Documentation: https://docs.polargrid.ai
- Issues: https://github.com/your-org/polargrid-sdk/issues
- Email: [email protected]
Made with ❄️ by the PolarGrid team.
