gemback

v0.7.1

Smart Gemini API Fallback Library for Node.js & TypeScript

💎 Gem Back

Smart Gemini API Fallback Library with Multi-Key Rotation & Monitoring

Gem Back is an npm library that provides an intelligent fallback system and production-grade monitoring for the Google Gemini API, automatically handling RPM (Requests Per Minute) rate limits.

Korean Documentation | Examples | Changelog


🎯 Why Gem Back?

The Gemini API has RPM (Requests Per Minute) limits on the free tier, causing 429 Too Many Requests errors in high-traffic scenarios. Gem Back solves this problem with:

Key Features ✨

  • Automatic Fallback: Seamlessly switches to alternate models when one fails
  • Smart Retry: Handles transient errors with Exponential Backoff
  • Multi-Key Rotation: Rotates through multiple API keys to bypass RPM limits
  • Streaming Support: Real-time response streaming (generateStream())
  • Conversational Interface: Multi-turn chat support (chat())
  • Statistics Tracking: Monitor usage and success rates per model/key
  • Zero Configuration: Works out of the box with sensible defaults
  • Full TypeScript Support: Complete type definitions and autocomplete
  • Dual Module Format: CommonJS + ESM support
  • Extensively Tested: 248 tests verify reliability
  • Monitoring & Tracking: Rate limit prediction and model health monitoring

🚀 Supported Models

Gem Back supports automatic fallback across Gemini models:

Default Fallback Chain (Optimized for Free Tier — v0.7.0, RPD-first):

  1. gemini-3.1-flash-lite — stable, 500 RPD (dominant daily quota)
  2. gemini-3.5-flash — newest, highest quality (20 RPD)
  3. gemini-3-flash-preview — backup (20 RPD) ⚠️

If output quality matters more than daily throughput, pass an explicit fallbackOrder putting gemini-3.5-flash first.

Free-Tier Quota Snapshot (2026-05-28):

| Model | RPM | TPM | RPD | Notes |
|---|---|---|---|---|
| gemini-3.1-flash-lite | 15 | 250K | 500 | |
| gemini-2.5-flash-lite | 10 | 250K | 20 | ⚠️ deprecated (shutdown 2026-07-22 → gemini-3.1-flash-lite) |
| gemini-3.5-flash | 5 | 250K | 20 | |
| gemini-3-flash-preview | 5 | 250K | 20 | ⚠️ preview |
| gemini-2.5-flash | 5 | 250K | 20 | ⚠️ deprecated (shutdown 2026-06-17 → gemini-3.1-flash-lite) |
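The RPD-first default chain can be reproduced from the snapshot values. The following is a minimal sketch, not gemback's actual selection logic: the `QuotaRow` shape and the `rpdFirstOrder` helper are hypothetical names introduced here for illustration. It drops deprecated models and sorts the rest by daily quota, relying on the stable sort to keep table order when RPD values tie.

```typescript
// Hypothetical shape for illustration; not a gemback export.
interface QuotaRow {
  model: string;
  rpm: number;
  rpd: number;
  deprecated: boolean;
}

// Values copied from the free-tier quota snapshot.
const snapshot: QuotaRow[] = [
  { model: 'gemini-3.1-flash-lite', rpm: 15, rpd: 500, deprecated: false },
  { model: 'gemini-2.5-flash-lite', rpm: 10, rpd: 20, deprecated: true },
  { model: 'gemini-3.5-flash', rpm: 5, rpd: 20, deprecated: false },
  { model: 'gemini-3-flash-preview', rpm: 5, rpd: 20, deprecated: false },
  { model: 'gemini-2.5-flash', rpm: 5, rpd: 20, deprecated: true },
];

// Hypothetical helper: highest daily quota first, deprecated models excluded.
function rpdFirstOrder(rows: QuotaRow[]): string[] {
  return rows
    .filter((row) => !row.deprecated) // filter() copies, so the input is not mutated
    .sort((a, b) => b.rpd - a.rpd) // stable sort keeps table order on ties
    .map((row) => row.model);
}

const order = rpdFirstOrder(snapshot);
console.log(order);
// ['gemini-3.1-flash-lite', 'gemini-3.5-flash', 'gemini-3-flash-preview']
```

The result could be passed as fallbackOrder, or reordered by hand when quality matters more than throughput.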

Paid-Only Models (still in ALL_MODELS; runtime warning when used on free-tier keys):

  • gemini-3.1-pro-preview
  • gemini-2.5-pro
  • gemini-2.0-flash
  • gemini-2.0-flash-lite

Deprecation Warnings (v0.6.0+): Models scheduled for shutdown are automatically tracked. Enable logLevel: 'warn' to see deprecation warnings, or use the DEPRECATED_MODELS export for programmatic access.

import { DEPRECATED_MODELS } from 'gemback';

// Check which models are deprecated
DEPRECATED_MODELS.forEach(({ model, shutdownDate, replacement }) => {
  console.log(`${model} → ${replacement} (by ${shutdownDate})`);
});

Model Auto-Update System: The library includes automation scripts to keep the model list current with Google's API updates. See Contributing Guide for details on updating models.


📦 Installation

npm install gemback
# or
yarn add gemback
# or
pnpm add gemback

🔄 Migrating from v0.6 to v0.7

v0.7.0 includes one always-on breaking change and one conditional one. Most call sites need no update.

Confirmed breaking change

  • DeprecatedModelInfo.reason is now a string-literal union ('replaced_by_newer' | 'removed_from_api' | 'tier_change') instead of a free-form string. If you read reason, switch to the union type. The old prose strings on existing entries were moved to a new optional notes: string field.

    // Before (v0.6)
    const reason: string = info.reason; // e.g. "Gemini 2.0 series end of life"
    
    // After (v0.7)
    const reason: DeprecationReason = info.reason; // 'replaced_by_newer' | ...
    const detail: string | undefined = info.notes; // original prose, if any
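One benefit of the union is that code branching on reason can be checked for exhaustiveness at compile time. The sketch below uses local stand-in types that mirror the v0.7 shape described above (in a real project they would presumably come from gemback's type exports). The `describeReason` helper and the wording of each summary are assumptions for illustration, including the interpretation of what each reason means.

```typescript
// Local stand-ins mirroring the v0.7 shape; not imported from gemback.
type DeprecationReason = 'replaced_by_newer' | 'removed_from_api' | 'tier_change';

interface DeprecatedModelInfoLike {
  model: string;
  reason: DeprecationReason;
  notes?: string;
}

// Hypothetical helper: turns a deprecation entry into a readable message.
function describeReason(info: DeprecatedModelInfoLike): string {
  let summary: string;
  switch (info.reason) {
    case 'replaced_by_newer':
      summary = 'a newer model replaces it';
      break;
    case 'removed_from_api':
      summary = 'it is no longer served by the API';
      break;
    case 'tier_change':
      summary = 'its tier availability changed';
      break;
    default: {
      // Stops type-checking if a new union member is added but not handled.
      const unreachable: never = info.reason;
      throw new Error(`Unhandled reason: ${String(unreachable)}`);
    }
  }
  // Append the optional prose notes when present.
  return info.notes
    ? `${info.model}: ${summary} (${info.notes})`
    : `${info.model}: ${summary}`;
}

console.log(describeReason({ model: 'example-model', reason: 'tier_change' }));
// example-model: its tier availability changed
```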

Default fallback order changed

If you didn't pass fallbackOrder to GemBack, the default sequence now optimizes for daily RPD instead of model quality:

gemini-3.1-flash-lite → gemini-3.5-flash → gemini-3-flash-preview

To keep the v0.6 quality-first behavior, pass it explicitly:

new GemBack({
  apiKey: process.env.GEMINI_API_KEY,
  fallbackOrder: ['gemini-3.5-flash', 'gemini-3-flash-preview', 'gemini-3.1-flash-lite'],
});

New utilities you can adopt

  • REMOVED_MODELS and DEPRECATED_MODELS exports are the single source of truth for what was removed/replaced.
  • RateLimitStatus.currentTPM / maxTPM / tpmUtilizationPercent are now populated for free-tier models.
  • gemini-3.5-flash and stable gemini-3.1-flash-lite are available.

Paid-only models on free-tier keys

If you invoke gemini-2.5-pro, gemini-2.0-flash, gemini-2.0-flash-lite, or gemini-3.1-pro-preview on a free-tier API key, you'll see a one-time logger.warn explaining the 4xx you can expect. Either upgrade the key or pin fallbackOrder to free-tier models.
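Pinning fallbackOrder to free-tier models can be done with a small filter. This is a sketch under the assumption that the paid-only list above is current. `PAID_ONLY` and `freeTierOnly` are hypothetical names, not gemback exports.

```typescript
// Paid-only model names as listed in this README.
const PAID_ONLY = new Set<string>([
  'gemini-3.1-pro-preview',
  'gemini-2.5-pro',
  'gemini-2.0-flash',
  'gemini-2.0-flash-lite',
]);

// Hypothetical helper: drops paid-only models and keeps the caller's ordering.
function freeTierOnly(order: string[]): string[] {
  return order.filter((model) => !PAID_ONLY.has(model));
}

const requested = ['gemini-2.5-pro', 'gemini-3.5-flash', 'gemini-3.1-flash-lite'];
const safeOrder = freeTierOnly(requested);
console.log(safeOrder); // ['gemini-3.5-flash', 'gemini-3.1-flash-lite']
```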


⚡ Quick Start

Basic Usage

import { GemBack } from 'gemback';

// Create client
const client = new GemBack({
  apiKey: process.env.GEMINI_API_KEY
});

// Generate text
const response = await client.generate('Hello, Gemini!');
console.log(response.text);
// Automatically selects the best model and handles fallback

Custom Fallback Order

const client = new GemBack({
  apiKey: process.env.GEMINI_API_KEY,
  fallbackOrder: [
    'gemini-3.5-flash',       // Optional: top-quality first if quality > daily throughput
    'gemini-3.1-flash-lite',  // Stable, highest free-tier RPD (500/day)
    'gemini-3-flash-preview', // Last-resort backup
  ],
  maxRetries: 3,
  timeout: 30000,
  debug: true, // Enable detailed logging
});

Streaming Response

const stream = client.generateStream('Tell me a long story');

for await (const chunk of stream) {
  process.stdout.write(chunk.text);
}

Multi-Key Rotation (New!)

Effectively bypass RPM limits by using multiple API keys:

const client = new GemBack({
  apiKeys: [
    process.env.GEMINI_API_KEY_1,
    process.env.GEMINI_API_KEY_2,
    process.env.GEMINI_API_KEY_3
  ].filter((key): key is string => Boolean(key)), // Drop unset variables so the array is string[]
  apiKeyRotationStrategy: 'round-robin' // or 'least-used'
});

// Automatically rotates through keys for each request
const response1 = await client.generate('First question'); // Uses key_1
const response2 = await client.generate('Second question'); // Uses key_2
const response3 = await client.generate('Third question'); // Uses key_3

// Check per-key statistics
const stats = client.getFallbackStats();
console.log(stats.apiKeyStats); // Usage and success rate per key

Rotation Strategies:

  • round-robin (default): Rotate through keys sequentially
  • least-used: Prioritize the least-used key
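To make the difference between the two strategies concrete, here is a minimal sketch of how such a selector could work. `KeySelector` is a hypothetical class written for this illustration. gemback's internal implementation may differ, for example in how it breaks ties or counts failed requests.

```typescript
type RotationStrategy = 'round-robin' | 'least-used';

// Hypothetical selector for illustration; not gemback's internal code.
class KeySelector {
  private readonly keys: string[];
  private readonly strategy: RotationStrategy;
  private readonly usage: number[];
  private cursor = 0;

  constructor(keys: string[], strategy: RotationStrategy = 'round-robin') {
    if (keys.length === 0) throw new Error('At least one API key is required');
    this.keys = keys;
    this.strategy = strategy;
    this.usage = keys.map(() => 0);
  }

  // Returns the index of the key to use for the next request.
  nextIndex(): number {
    let index: number;
    if (this.strategy === 'round-robin') {
      index = this.cursor;
      this.cursor = (this.cursor + 1) % this.keys.length;
    } else {
      // least-used: first key with the smallest request count
      index = this.usage.indexOf(Math.min(...this.usage));
    }
    this.usage[index] += 1;
    return index;
  }
}

const roundRobin = new KeySelector(['key_1', 'key_2', 'key_3']);
const picks = [
  roundRobin.nextIndex(),
  roundRobin.nextIndex(),
  roundRobin.nextIndex(),
  roundRobin.nextIndex(),
];
console.log(picks); // [0, 1, 2, 0]
```

With evenly distributed traffic both strategies pick the same keys. They diverge only when usage counts become uneven.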

Monitoring & Tracking (New!)

Improve stability with real-time rate limit tracking and model health monitoring:

const client = new GemBack({
  apiKey: process.env.GEMINI_API_KEY,
  enableMonitoring: true  // Enable monitoring
});

// Use the API
await client.generate('Question 1');
await client.generate('Question 2');
// ...

// Get detailed monitoring statistics
const stats = client.getFallbackStats();

// Check rate limit status
console.log(stats.monitoring?.rateLimitStatus);
// [
//   {
//     model: 'gemini-2.5-flash',
//     currentRPM: 5,          // Current requests per minute
//     maxRPM: 15,             // Maximum RPM
//     utilizationPercent: 33, // Utilization percentage
//     isNearLimit: false,     // Near limit warning
//     willExceedSoon: false,  // Will exceed soon warning
//     windowStats: {
//       requestsInLastMinute: 5,
//       requestsInLast5Minutes: 12,
//       averageRPM: 2.4
//     }
//   }
// ]

// Check model health status
console.log(stats.monitoring?.modelHealth);
// [
//   {
//     model: 'gemini-2.5-flash',
//     status: 'healthy',           // healthy | degraded | unhealthy
//     successRate: 0.98,           // Success rate
//     averageResponseTime: 1234,   // Average response time (ms)
//     availability: 0.99,          // Availability
//     consecutiveFailures: 0,      // Consecutive failures
//     metrics: {
//       totalRequests: 100,
//       successfulRequests: 98,
//       failedRequests: 2,
//       p50ResponseTime: 1100,     // 50th percentile
//       p95ResponseTime: 1800,     // 95th percentile
//       p99ResponseTime: 2100      // 99th percentile
//     }
//   }
// ]

// Overall summary
console.log(stats.monitoring?.summary);
// {
//   healthyModels: 3,
//   degradedModels: 1,
//   unhealthyModels: 0,
//   overallSuccessRate: 0.96,
//   averageResponseTime: 1500
// }

Monitoring Features:

  • Rate Limit Tracking: Real-time RPM usage tracking per model
  • Predictive Warnings: Automatic warnings before hitting limits (80%, 90% thresholds)
  • Health Monitoring: Track success rate, response time, and availability per model
  • Percentile Metrics: Analyze p50, p95, p99 response times
  • Failure Detection: Automatic status detection (healthy/degraded/unhealthy)

📖 Core Features

1. Automatic Fallback

// Automatically falls back through the fallback chain
// when a model hits rate limit (default v0.7.0: gemini-3.1-flash-lite → gemini-3.5-flash → gemini-3-flash-preview)
const response = await client.generate('Complex question');

2. Retry Logic

const client = new GemBack({
  apiKey: 'YOUR_KEY',
  maxRetries: 3, // Max retries per model
  retryDelay: 1000 // Initial retry delay (ms)
});

3. Error Handling

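import { GeminiBackError } from 'gemback';

try {
  const response = await client.generate('Hello');
  console.log(response.text);
} catch (error) {
  // GeminiBackError carries the details of every attempt
  if (error instanceof GeminiBackError) {
    console.log('Models attempted:', error.allAttempts);
    console.log('Last error:', error.message);
  }
}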

4. Statistics

const stats = client.getFallbackStats();
console.log(stats);
// {
//   totalRequests: 100,
//   successRate: 0.95,
//   failureCount: 5,
//   modelUsage: {
//     'gemini-3-flash-preview': 70,
//     'gemini-2.5-flash': 30
//   },
//   apiKeyStats: [  // Only in multi-key mode
//     {
//       keyIndex: 0,
//       totalRequests: 35,
//       successCount: 33,
//       failureCount: 2,
//       successRate: 0.94,
//       lastUsed: Date
//     },
//     // ... other keys
//   ],
//   monitoring: {  // Only when enableMonitoring: true
//     rateLimitStatus: [...],  // Rate limit status per model
//     modelHealth: [...],      // Health status per model
//     summary: {
//       healthyModels: 3,
//       degradedModels: 1,
//       unhealthyModels: 0,
//       overallSuccessRate: 0.96,
//       averageResponseTime: 1500
//     }
//   }
// }

5. System Instructions (v0.5.0+)

Control the model's behavior, personality, and response style:

// String format
const response = await client.generate('Explain TypeScript', {
  systemInstruction: 'You are a helpful programming tutor. Explain concepts clearly for beginners.',
});

// Structured Content format
const response2 = await client.generate('What is async/await?', {
  systemInstruction: {
    role: 'user',
    parts: [{ text: 'You are a senior engineer. Provide technical, detailed explanations.' }],
  },
});

// Works with all generation methods
const stream = client.generateStream('Explain promises', {
  systemInstruction: 'Keep explanations under 100 words. Use bullet points.',
});

const chatResponse = await client.chat(messages, {
  systemInstruction: 'You are a friendly coding mentor. Use analogies to explain.',
});

Use Cases:

  • Guide model personality and tone
  • Enforce output formatting requirements
  • Create role-based assistants (tutor, technical writer, etc.)
  • Maintain consistent behavior across conversations

6. Function Calling / Tool Use (v0.5.0+)

Enable the model to call external functions with structured parameters:

import type { FunctionDeclaration } from 'gemback';

// Define a function
const weatherFunction: FunctionDeclaration = {
  name: 'get_current_weather',
  description: 'Get the current weather in a given location',
  parameters: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'The city name, e.g. Tokyo, London',
      },
      unit: {
        type: 'string',
        enum: ['celsius', 'fahrenheit'],
      },
    },
    required: ['location'],
  },
};

// Use the function
const response = await client.generate("What's the weather in Tokyo?", {
  tools: [weatherFunction],
  toolConfig: {
    functionCallingMode: 'auto', // 'auto' | 'any' | 'none'
  },
});

// Check if model called the function
if (response.functionCalls && response.functionCalls.length > 0) {
  response.functionCalls.forEach((call) => {
    console.log('Function:', call.name);
    console.log('Arguments:', call.args);

    // Execute your actual function here
    const result = getCurrentWeather(call.args.location, call.args.unit);
    console.log('Result:', result);
  });
}

Function Calling Modes:

  • auto: Model decides when to call functions (default)
  • any: Force model to call at least one function
  • none: Disable function calling
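When several functions are declared, a lookup table keeps the dispatch step tidy. This is a sketch only: `FunctionCallLike` mirrors the name and args fields used in the example above, while `handlers` and `dispatch` are hypothetical names, and the weather handler returns a stub rather than real data.

```typescript
// Mirrors the name/args fields read from response.functionCalls.
interface FunctionCallLike {
  name: string;
  args: Record<string, unknown>;
}

type Handler = (args: Record<string, unknown>) => unknown;

// Hypothetical local implementations keyed by declared function name.
const handlers: Record<string, Handler> = {
  get_current_weather: (args) => {
    const location = String(args['location']);
    const unit = args['unit'] === 'fahrenheit' ? 'fahrenheit' : 'celsius';
    // Stubbed result; a real handler would query a weather service.
    return { location, unit, temperature: null };
  },
};

// Runs each requested call through its handler and collects the results.
function dispatch(calls: FunctionCallLike[]): { name: string; result: unknown }[] {
  return calls.map((call) => {
    const handler = handlers[call.name];
    if (!handler) throw new Error(`No handler registered for ${call.name}`);
    return { name: call.name, result: handler(call.args) };
  });
}

const results = dispatch([{ name: 'get_current_weather', args: { location: 'Tokyo' } }]);
console.log(JSON.stringify(results[0]));
```

Throwing on an unknown name is one design choice. Returning an error object to send back to the model would be another.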

Advanced Features:

// Restrict to specific functions
const response = await client.generate(prompt, {
  tools: [weatherFunction, calculatorFunction, databaseFunction],
  toolConfig: {
    functionCallingMode: 'any',
    allowedFunctionNames: ['get_current_weather'], // Only allow weather
  },
});

// Multi-turn conversation with function results
const followUpResponse = await client.generateContent([
  { role: 'user', parts: [{ text: "What's the weather?" }] },
  { role: 'model', parts: [{ functionCall: { name: 'get_current_weather', args: {...} } }] },
  { role: 'user', parts: [{ functionResponse: { name: 'get_current_weather', response: {...} } }] },
  { role: 'user', parts: [{ text: 'Should I bring an umbrella?' }] },
]);

Use Cases:

  • Integrate with external APIs and databases
  • Perform calculations and data processing
  • Access real-time information
  • Create structured workflows and automation
  • Build AI agents with tool access

7. Safety Settings (v0.5.0+)

Configure content filtering and safety thresholds for different harm categories:

import { HarmCategory, HarmBlockThreshold } from '@google/genai';

// Basic safety settings
const response = await client.generate('Tell me about content moderation', {
  safetySettings: [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
      threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    },
  ],
});

// Strict filtering for children's content
const childContent = await client.generate('Tell a story for kids', {
  safetySettings: [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
  ],
});

// Combine with other options
const response3 = await client.generate('Write an educational article', {
  systemInstruction: 'You are an educational content writer.',
  safetySettings: [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    },
  ],
  temperature: 0.7,
});

Available Harm Categories:

  • HARM_CATEGORY_HARASSMENT
  • HARM_CATEGORY_HATE_SPEECH
  • HARM_CATEGORY_SEXUALLY_EXPLICIT
  • HARM_CATEGORY_DANGEROUS_CONTENT

Blocking Thresholds:

  • BLOCK_NONE: No blocking
  • BLOCK_ONLY_HIGH: Block only high severity content
  • BLOCK_MEDIUM_AND_ABOVE: Block medium and high severity (recommended)
  • BLOCK_LOW_AND_ABOVE: Block low, medium, and high severity (strictest)

Use Cases:

  • Child-safe content generation
  • Compliance with content policies
  • Brand-appropriate responses
  • Educational content filtering

8. JSON Mode (v0.5.0+)

Get structured JSON responses with schema validation:

import type { ResponseSchema } from 'gemback';

// Basic JSON mode
const response = await client.generate('Generate a user profile with name, age, and email', {
  responseMimeType: 'application/json',
});

console.log(response.json);  // Parsed JSON object
console.log(response.text);  // Raw JSON string

// JSON mode with schema validation
const userSchema: ResponseSchema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    age: { type: 'number' },
    email: { type: 'string' },
  },
  required: ['name', 'age', 'email'],
};

const response2 = await client.generate('Generate a user profile', {
  responseMimeType: 'application/json',
  responseSchema: userSchema,
});

// Type-safe usage
interface User {
  name: string;
  age: number;
  email: string;
}

const user = response2.json as User;
console.log(user.name, user.age, user.email);

// Array of objects
const productsSchema: ResponseSchema = {
  type: 'array',
  items: {
    type: 'object',
    properties: {
      id: { type: 'number' },
      name: { type: 'string' },
      price: { type: 'number' },
    },
    required: ['id', 'name', 'price'],
  },
};

const products = await client.generate('Generate 3 products', {
  responseMimeType: 'application/json',
  responseSchema: productsSchema,
});

// Complex nested structures
const blogPostSchema: ResponseSchema = {
  type: 'object',
  properties: {
    title: { type: 'string' },
    author: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        email: { type: 'string' },
      },
    },
    tags: {
      type: 'array',
      items: { type: 'string' },
    },
  },
  required: ['title', 'author'],
};

Schema Types Supported:

  • object: Object with defined properties
  • array: Array of items
  • string, number, boolean, null: Primitive types
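The `response2.json as User` cast in the type-safe usage example is unchecked at runtime. If the parsed value needs verifying before use, a type guard is one option. The `isUser` function below is a hypothetical helper, and the JSON string stands in for a model response.

```typescript
interface User {
  name: string;
  age: number;
  email: string;
}

// Hypothetical runtime guard matching the User interface field by field.
function isUser(value: unknown): value is User {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record['name'] === 'string' &&
    typeof record['age'] === 'number' &&
    typeof record['email'] === 'string'
  );
}

// Stand-in for response.text from a JSON-mode request.
const parsed: unknown = JSON.parse('{"name":"Ada","age":36,"email":"ada@example.com"}');

if (isUser(parsed)) {
  // parsed is narrowed to User inside this branch
  console.log(parsed.name, parsed.age, parsed.email);
}
```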

Use Cases:

  • API response formatting
  • Data extraction and structuring
  • Type-safe API integration
  • Structured content generation
  • Database-ready outputs

🔧 API Reference

GemBack

Constructor Options

import type { GeminiModel, RateLimitConfig } from 'gemback';

interface GemBackOptions {
  apiKey?: string;                   // Gemini API key (single key)
  apiKeys?: string[];                // Multiple API keys (multi-key mode)
  fallbackOrder?: GeminiModel[];     // Optional: Fallback order
  maxRetries?: number;               // Optional: Max retries (default: 2)
  timeout?: number;                  // Optional: Request timeout (default: 30000ms)
  retryDelay?: number;               // Optional: Initial retry delay (default: 1000ms)
  debug?: boolean;                   // Optional: Debug logging (default: false)
  logLevel?: 'debug' | 'info' | 'warn' | 'error' | 'silent';
  apiKeyRotationStrategy?: 'round-robin' | 'least-used'; // Key rotation strategy (default: round-robin)
  enableMonitoring?: boolean;        // Optional: Enable monitoring (default: false)
  enableRateLimitPrediction?: boolean; // Optional: Rate limit prediction warnings (default: false)
  customRateLimits?: Partial<Record<GeminiModel, Partial<RateLimitConfig>>>; // Optional: per-model
                                     // RPM/TPM/RPD overrides applied on top of
                                     // FREE_TIER_LIMITS defaults. Per-entry field merge,
                                     // so { 'gemini-2.5-flash': { rpm: 10 } } preserves
                                     // the existing tpm/rpd. Only consulted when
                                     // `enableMonitoring: true`.
}

Note: Either apiKey or apiKeys must be provided.

Methods

generate(prompt, options?)

Generate text response

const response = await client.generate('Hello!', {
  model: 'gemini-2.5-flash',  // Specify model
  temperature: 0.7,
  maxTokens: 1000,
  systemInstruction: 'You are a helpful assistant',  // v0.5.0+
  tools: [weatherFunction],  // v0.5.0+
  toolConfig: { functionCallingMode: 'auto' },  // v0.5.0+
  safetySettings: [{ category: HarmCategory.HARM_CATEGORY_HARASSMENT, threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE }],  // v0.5.0+
  responseMimeType: 'application/json',  // v0.5.0+
  responseSchema: { type: 'object', properties: { ... } }  // v0.5.0+
});

GenerateOptions:

interface GenerateOptions {
  model?: GeminiModel;
  temperature?: number;           // 0.0 - 2.0
  maxTokens?: number;            // Max output tokens
  topP?: number;                 // 0.0 - 1.0
  topK?: number;                 // Top-K sampling
  systemInstruction?: string | Content;  // v0.5.0+: Control model behavior
  tools?: FunctionDeclaration[];         // v0.5.0+: Available functions
  toolConfig?: ToolConfig;               // v0.5.0+: Function calling config
  safetySettings?: SafetySetting[];      // v0.5.0+: Content filtering
  responseMimeType?: string;             // v0.5.0+: Response format (e.g., 'application/json')
  responseSchema?: ResponseSchema;       // v0.5.0+: JSON schema validation
}

interface ToolConfig {
  functionCallingMode?: 'auto' | 'any' | 'none';
  allowedFunctionNames?: string[];
}

generateStream(prompt, options?)

Generate streaming response

const stream = client.generateStream('Tell me a story');
for await (const chunk of stream) {
  console.log(chunk.text);
}

chat(messages, options?)

Conversational interface

const response = await client.chat([
  { role: 'user', content: 'Hello' },
  { role: 'assistant', content: 'Hi! How can I help?' },
  { role: 'user', content: 'Tell me about TypeScript' }
]);

getFallbackStats()

Get fallback statistics

const stats = client.getFallbackStats();

⚙️ Configuration

Basic Configuration

const client = new GemBack({
  apiKey: 'YOUR_KEY',

  // Specify models to use
  fallbackOrder: [
    'gemini-3.1-flash-lite',
    'gemini-3.5-flash',
    'gemini-3-flash-preview',
  ],

  // Retry settings
  maxRetries: 3,
  retryDelay: 2000,

  // Timeout settings
  timeout: 60000,

  // Logging settings
  debug: true,
  logLevel: 'info'
});

Advanced Configuration (v0.2.0+)

const client = new GemBack({
  // Multi-key rotation (v0.2.0+)
  apiKeys: ['KEY_1', 'KEY_2', 'KEY_3'],
  apiKeyRotationStrategy: 'least-used',  // or 'round-robin'

  // Monitoring & tracking (v0.2.0+)
  enableMonitoring: true,                // Enable monitoring
  enableRateLimitPrediction: true,       // Rate limit prediction warnings

  // Base settings
  fallbackOrder: ['gemini-3.1-flash-lite', 'gemini-3.5-flash', 'gemini-3-flash-preview'],
  maxRetries: 2,
  timeout: 30000,
  logLevel: 'info'
});

🔄 Fallback Behavior

Error Handling Scenarios

| Error Type | Handling |
|-----------|-----------|
| 429 RPM Limit | ⚡ Immediate fallback to next model |
| 5xx Server Error | 🔄 Retry then fallback |
| Timeout | 🔄 Retry then fallback |
| 401/403 Auth Error | ❌ Immediate failure (stop fallback) |
| All Models Failed | ❌ Return detailed error info |
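The table can be read as a small decision function. The sketch below is one possible encoding, not gemback's source. `Action` and `classify` are hypothetical names, and the handling of 4xx codes that the table does not list is an assumption.

```typescript
type Action = 'fallback' | 'retry-then-fallback' | 'fail';

// Hypothetical classifier following the table. `status` is an HTTP status
// code, or undefined for timeouts where no response was received.
function classify(status: number | undefined): Action {
  if (status === undefined) return 'retry-then-fallback'; // timeout
  if (status === 429) return 'fallback'; // RPM limit: move to the next model
  if (status === 401 || status === 403) return 'fail'; // auth errors stop the chain
  if (status >= 500 && status <= 599) return 'retry-then-fallback';
  // Assumption: other codes are not covered by the table; treated as failures here.
  return 'fail';
}

console.log(classify(429)); // fallback
console.log(classify(503)); // retry-then-fallback
console.log(classify(401)); // fail
```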

Retry Strategy

  • Exponential Backoff: 1s → 2s → 4s → ...
  • Retryable Errors: 5xx, Timeout, Network Error
  • Non-retryable Errors: 4xx (except 429), Auth errors
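The 1s → 2s → 4s schedule corresponds to doubling the initial retryDelay on each attempt. Here is a minimal sketch of that pattern, assuming no jitter and no upper cap (the README does not say whether gemback applies either). `backoffDelay` and `withRetry` are hypothetical helpers, not part of the library's API.

```typescript
// Delay before retry number `attempt` (0-based): retryDelayMs * 2^attempt.
function backoffDelay(attempt: number, retryDelayMs = 1000): number {
  return retryDelayMs * 2 ** attempt;
}

// Hypothetical wrapper: retries `task` up to maxRetries times, waiting
// longer after each failure, and rethrows non-retryable errors at once.
async function withRetry<T>(
  task: () => Promise<T>,
  maxRetries = 2,
  retryDelayMs = 1000,
  isRetryable: (error: unknown) => boolean = () => true,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (attempt >= maxRetries || !isRetryable(error)) throw error;
      const delay = backoffDelay(attempt, retryDelayMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}

const schedule = [0, 1, 2].map((attempt) => backoffDelay(attempt));
console.log(schedule); // [1000, 2000, 4000]
```

Passing an isRetryable predicate lets the caller exclude 4xx and auth errors, matching the non-retryable list above.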

📊 Logging Examples

Basic Logging (debug: true)

[GemBack] Attempting: gemini-3-flash-preview
[GemBack] Failed (429 RPM Limit): gemini-3-flash-preview
[GemBack] Fallback to: gemini-2.5-flash
[GemBack] Retry attempt 1/2: gemini-2.5-flash
[GemBack] Success: gemini-2.5-flash (2nd attempt)

With Monitoring Enabled (enableMonitoring: true)

[GemBack] Monitoring enabled: Rate limit tracking and health monitoring
[GemBack] Attempting: gemini-2.5-flash (API Key #1)
[GemBack] Rate limit warning for gemini-2.5-flash: 12/15 RPM
[GemBack] Success: gemini-2.5-flash (1234ms)

🗺️ Roadmap

Phase 1: Core Features ✅ (Completed - v0.1.0)

  • [x] Project structure
  • [x] Basic fallback logic
  • [x] 4 model support
  • [x] TypeScript type definitions
  • [x] Automatic retry with Exponential Backoff
  • [x] Streaming response support
  • [x] Conversational interface (chat)
  • [x] Statistics tracking
  • [x] Comprehensive test coverage (100 tests)
  • [x] Complete documentation and examples

Phase 2: Advanced Features ✅ (Completed - v0.2.0)

Phase 2 added advanced features to improve production stability.

🔐 Multi-Key Support & Rotation ✅

  • [x] Load balancing with multiple API keys
    • Automatic key rotation to bypass RPM limits
    • Support for round-robin and least-used strategies
    • Per-key usage tracking and statistics

📊 Monitoring & Tracking ✅

  • [x] Rate Limit Tracking & Prediction

    • Real-time usage tracking per model
    • Predictive warnings before hitting limits (80%, 90% thresholds)
    • Sliding window analysis (1-minute, 5-minute)
  • [x] Health Check & Model Status Monitoring

    • Status monitoring per model (response time, success rate, availability)
    • Real-time health status (healthy/degraded/unhealthy)
    • Percentile-based performance metrics (p50, p95, p99)
    • Consecutive failure detection and tracking

Phase 2 Achievements:

  • ✅ 165 comprehensive tests (65% increase from Phase 1)
  • ✅ Production-level monitoring system
  • ✅ Multi-key rotation for RPM limit bypass
  • ✅ Real-time model health tracking

Phase 2.5: Advanced Content Generation ✅ (Completed - v0.5.0)

Phase 2.5 adds production-grade content generation features from the Google GenAI SDK, including function calling, system instructions, safety controls, and structured outputs.

🎯 System Instructions ✅

  • [x] Control model behavior and response style
    • Guide model personality, tone, and output format
    • Support both string and structured Content format
    • Apply instructions across all generation methods
    • Maintain instructions through fallback chains

🔧 Function Calling (Tool Use) ✅

  • [x] Enable AI to call external functions
    • Define functions with structured parameters (JSON Schema)
    • Multiple function calling modes: auto, any, none
    • Restrict allowed functions with allowedFunctionNames
    • Extract function calls from model responses
    • Support multi-turn conversations with function results

🛡️ Safety Settings ✅

  • [x] Content filtering and moderation
    • Configure safety thresholds for different harm categories
    • Support for harassment, hate speech, sexually explicit, and dangerous content filtering
    • Multiple blocking levels: none, low, medium, high
    • Child-safe content generation
    • Compliance with content policies

📊 JSON Mode ✅

  • [x] Structured JSON responses
    • Automatic JSON parsing with response.json field
    • Schema validation with OpenAPI-compatible schemas
    • Support for objects, arrays, and nested structures
    • Type-safe integration with TypeScript interfaces
    • Structured data extraction and API response formatting

Phase 2.5 Achievements:

  • ✅ 248 comprehensive tests
  • ✅ Full GenAI SDK compatibility for all advanced features
  • ✅ Production-ready content safety controls
  • ✅ Type-safe structured outputs with schema validation
  • ✅ Comprehensive examples for all features

Phase 3: Performance & Ecosystem (Planned)

Phase 3 will focus on performance optimization and ecosystem expansion.

⚡ Performance Optimization

  • [ ] Response Caching

    • Reduce API calls with caching
    • TTL-based cache expiration
    • Memory-efficient cache strategy
  • [ ] Connection Pooling

    • Improve performance with connection reuse
    • Optimize concurrent request handling
    • Efficient resource usage

🛡️ Advanced Reliability Patterns

  • [ ] Circuit Breaker Pattern
    • Temporary blocking on persistent failures
    • Automatic recovery and retry
    • System overload prevention

🌐 Ecosystem Expansion

  • [ ] CLI tools
  • [ ] Web dashboard (real-time monitoring)
  • [ ] Monitoring integration (Prometheus, Grafana)
  • [ ] Additional AI model support (Claude, GPT, etc.)

🤝 Contributing

Contributions are welcome! You can participate by:

  1. Reporting issues
  2. Suggesting features
  3. Submitting pull requests
  4. Improving documentation

See CONTRIBUTING.md for details.


📄 License

MIT License - Free to use, modify, and distribute.


💡 FAQ

Q: Where can I get an API key?

A: Get a free API key at Google AI Studio.

Q: What happens when all models fail?

A: Throws GeminiBackError with details of all attempts.

Q: Can I use only specific models?

A: Yes, pass your preferred models in the fallbackOrder option.

Q: What are the costs?

A: Only Gemini API costs apply. Gem Back is free and open-source.


🌟 Projects Using Gem Back

Be the first to showcase your project using Gem Back!

If you're using Gem Back in your project, we'd love to feature it here. Your project could be the first one listed!

Updated: 2025-11-29


Made with ❤️ by Laeyoung