swarm-gem
v1.0.0
🐝 SwarmGem
Google Gemini API Key Swarm Manager - Automatically rotate multiple API keys to handle rate limits and maximize throughput.
⚠️ Important Warning
CRITICAL: Gemini free tier has EXTREMELY LOW rate limits:
- gemini-2.5-flash: Only 5 requests per minute per API key
- gemini-3-flash: Only 5 requests per minute per API key
To handle 10 requests/second, you would need at least 120 API keys!
This package is suitable for:
- ✅ Burst traffic (short periods of high load)
- ✅ Moderate usage (< 5 req/min with few keys)
- ✅ Development and testing
- ❌ NOT suitable for sustained high throughput with free tier
For production high-throughput applications, consider upgrading to paid tier with higher limits.
🚀 Quick Start
Installation
npm install swarm-gem
Setup Environment Variables
# .env
GEMINI_SWARM_KEYS="key1,key2,key3,key4,key5"
GEMINI_MODEL="gemini-2.5-flash"
Get your API keys from: https://aistudio.google.com/app/apikey
Basic Usage
import SwarmGem from 'swarm-gem'
// Initialize from environment variables
const sg = SwarmGem.init()
// Generate content
const response = await sg.generate("Hello, how are you?")
console.log(response)
📖 Usage Examples
With Options
import SwarmGem from 'swarm-gem'
const sg = SwarmGem.init()
const response = await sg.generate("Write a poem about the ocean", {
  temperature: 0.9, // Higher creativity
  maxTokens: 1000 // Limit response length
})
console.log(response)
Manual Configuration
import SwarmGem from 'swarm-gem'
const sg = SwarmGem.init({
  keys: ['key1', 'key2', 'key3'],
  model: 'gemini-3-flash',
  rateLimitsPath: './custom-rate-limits.json' // Optional
})
const response = await sg.generate("Explain quantum computing")
console.log(response)
Error Handling
import SwarmGem from 'swarm-gem'
const sg = SwarmGem.init()
try {
  const response = await sg.generate("Your prompt here")
  console.log(response)
} catch (error) {
  if (error.message.includes('exhausted')) {
    console.error('All API keys hit rate limit. Please wait or add more keys.')
  } else if (error.message.includes('permanently failed')) {
    console.error('Some API keys are invalid. Check your keys.')
  } else {
    console.error('API error:', error.message)
  }
}
⚙️ Configuration
Environment Variables
| Variable | Description | Required | Default |
|----------|-------------|----------|---------|
| GEMINI_SWARM_KEYS | Comma-separated API keys | Yes | - |
| GEMINI_MODEL | Model name to use | No | gemini-2.5-flash |
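The table above can be read as a small parsing step. The sketch below shows one way an application might read the two variables before handing them to `SwarmGem.init({ keys, model })`. The helper name `readSwarmEnv` and the `SwarmEnvConfig` type are hypothetical and not part of the package; how SwarmGem itself handles whitespace or duplicate keys is not documented here, so the trimming and de-duplication are assumptions.

```typescript
// Hypothetical helper: not part of swarm-gem's documented API.
interface SwarmEnvConfig {
  keys: string[];
  model: string;
}

// Parse the two documented variables from an env-like object.
function readSwarmEnv(env: Record<string, string | undefined>): SwarmEnvConfig {
  const keys = (env["GEMINI_SWARM_KEYS"] ?? "")
    .split(",")
    .map((key) => key.trim())
    .filter((key) => key.length > 0);

  if (keys.length === 0) {
    throw new Error("GEMINI_SWARM_KEYS must contain at least one API key");
  }

  // Assumption: a repeated key adds no extra quota, so duplicates are dropped.
  const uniqueKeys = Array.from(new Set(keys));

  // GEMINI_MODEL is optional; the documented default is gemini-2.5-flash.
  const rawModel = (env["GEMINI_MODEL"] ?? "").trim();
  const model = rawModel.length > 0 ? rawModel : "gemini-2.5-flash";

  return { keys: uniqueKeys, model };
}

const parsedEnv = readSwarmEnv({ GEMINI_SWARM_KEYS: "key1, key2,,key1" });
```

In a Node.js application the argument would typically be `process.env`, and the result can be passed straight to `SwarmGem.init`.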
Supported Models
Check config/rate-limits.json for all supported models:
| Model | RPM | TPM | RPD | Use Case |
|-------|-----|-----|-----|----------|
| gemini-2.5-flash-lite | 10 | 250K | 20 | Fastest, lightweight tasks |
| gemini-2.5-flash | 5 | 250K | 20 | Fast, general purpose |
| gemini-3-flash | 5 | 250K | 20 | Latest fast model |
| gemini-2.5-flash-tts | 3 | 10K | 10 | Text-to-speech |
| gemini-2.5-flash-native-audio-dialog | Unlimited | 1M | Unlimited | Live audio API |
RPM = Requests Per Minute | TPM = Tokens Per Minute | RPD = Requests Per Day
🔧 Custom Rate Limits
You can customize rate limits by providing your own configuration file:
1. Create Custom Configuration
[
  {
    "model": "gemini-2.5-flash",
    "category": "Text-out models",
    "rpm": "0 / 5",
    "tpm": "0 / 250K",
    "rpd": "0 / 20"
  }
]
Format Notes:
- rpm: "current / limit" (e.g., "0 / 10")
- Supports suffixes: K (thousands), M (millions)
- Use Unlimited for no limit
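To make the format notes concrete, here is a minimal sketch of how a "current / limit" string could be turned into a number. The function name `parseLimitSketch` is hypothetical; the package ships its own `parseRateLimit` (see the Testing section), whose implementation may differ from this one.

```typescript
// Illustrative parser for "current / limit" strings. This is a sketch,
// not the package's own parseRateLimit.
function parseLimitSketch(value: string): number {
  // Keep only the part after the slash, e.g. "0 / 250K" -> "250K".
  const parts = value.split("/");
  const raw = parts[parts.length - 1].trim();

  if (raw.toLowerCase() === "unlimited") {
    return Infinity;
  }

  // A number with an optional K (thousands) or M (millions) suffix.
  const match = /^(\d+(?:\.\d+)?)\s*([KM])?$/i.exec(raw);
  if (match === null) {
    throw new Error(`Unrecognized rate limit: "${value}"`);
  }

  const base = Number(match[1]);
  const suffix = (match[2] ?? "").toUpperCase();
  const multiplier = suffix === "K" ? 1000 : suffix === "M" ? 1000000 : 1;
  return base * multiplier;
}
```

Rejecting unrecognized strings with an error, rather than silently returning 0, is a design choice of this sketch: a typo in the config file then fails loudly at startup.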
2. Use Custom Configuration
const sg = SwarmGem.init({
  keys: ['key1', 'key2'],
  model: 'gemini-2.5-flash',
  rateLimitsPath: './my-rate-limits.json'
})
📊 How It Works
Key Rotation Logic
┌─────────────────────────────────────┐
│  Request comes in                   │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Pick available key                 │
│  - Not in cooldown                  │
│  - Within rate limits               │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Make API call                      │
└──────────────┬──────────────────────┘
               │
        ┌──────┴──────┐
        │             │
        ▼             ▼
     Success     Error (429/401/403)
        │             │
        ▼             ▼
    Mark used    Mark failed
    Return       Try next key
Rate Limit Tracking
- Per-Key Tracking: Each key tracks its own usage counters
- Time Windows:
  - Minute counter resets after 60 seconds
  - Day counter resets after 24 hours
- Cooldown: Failed keys go into cooldown:
  - 429 (Rate Limit): 60 seconds (or retry-after header)
  - 401/403 (Auth): Permanent (Infinity)
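The tracking rules above can be sketched as a small per-key state record. Everything in this block (the `KeyState` shape, the function names, the fixed-window reset) is an assumption made for illustration; the package's internal data structures are not documented here and may work differently.

```typescript
// Hypothetical per-key state; field names are illustrative only.
interface KeyState {
  key: string;
  minuteCount: number;
  minuteWindowStart: number; // ms timestamp
  dayCount: number;
  dayWindowStart: number; // ms timestamp
  cooldownUntil: number; // ms timestamp; Infinity means permanently failed
}

const MINUTE_MS = 60 * 1000;
const DAY_MS = 24 * 60 * MINUTE_MS;

function newKeyState(key: string, now: number): KeyState {
  return {
    key,
    minuteCount: 0,
    minuteWindowStart: now,
    dayCount: 0,
    dayWindowStart: now,
    cooldownUntil: 0,
  };
}

// Reset any counter whose window has elapsed.
function refreshWindows(state: KeyState, now: number): void {
  if (now - state.minuteWindowStart >= MINUTE_MS) {
    state.minuteCount = 0;
    state.minuteWindowStart = now;
  }
  if (now - state.dayWindowStart >= DAY_MS) {
    state.dayCount = 0;
    state.dayWindowStart = now;
  }
}

// A key is usable when it is out of cooldown and under both limits.
function isAvailable(state: KeyState, rpm: number, rpd: number, now: number): boolean {
  refreshWindows(state, now);
  return now >= state.cooldownUntil && state.minuteCount < rpm && state.dayCount < rpd;
}

function markUsed(state: KeyState, now: number): void {
  refreshWindows(state, now);
  state.minuteCount += 1;
  state.dayCount += 1;
}

// 429 gives a temporary cooldown (60 s unless a retry-after value is known);
// 401/403 disable the key permanently.
function markFailed(state: KeyState, status: number, now: number, retryAfterSeconds = 60): void {
  if (status === 401 || status === 403) {
    state.cooldownUntil = Infinity;
  } else if (status === 429) {
    state.cooldownUntil = now + retryAfterSeconds * 1000;
  }
}
```

Passing `now` explicitly instead of calling `Date.now()` inside each function keeps the logic deterministic and easy to test.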
Automatic Retry
When a key hits rate limit:
- Mark key as failed with cooldown time
- Automatically pick next available key
- Retry request with new key
- Continue until success or all keys exhausted
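The retry steps above amount to a loop over the key pool. The sketch below illustrates that loop in isolation; `callWithRotation`, `statusOf`, and the assumption that errors carry a numeric `status` property are all hypothetical and not taken from the package's source.

```typescript
// Assumption: failed calls throw an object with a numeric `status` field.
function statusOf(error: unknown): number | undefined {
  if (typeof error === "object" && error !== null && "status" in error) {
    const status = (error as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

// Try each key in turn until one call succeeds or no keys are left.
async function callWithRotation<T>(
  keys: string[],
  call: (key: string) => Promise<T>,
): Promise<T> {
  for (const key of keys) {
    try {
      return await call(key);
    } catch (error) {
      const status = statusOf(error);
      // Only rate-limit (429) and auth (401/403) failures move on to the
      // next key; anything else is treated as non-recoverable.
      if (status !== 429 && status !== 401 && status !== 403) {
        throw error;
      }
    }
  }
  throw new Error("All API keys exhausted");
}
```

A fuller version would also record the cooldown for each failed key so that later requests skip it; that bookkeeping is left out here to keep the loop readable.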
🧮 Calculating Required Keys
To determine how many keys you need:
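Required Keys = (Requests per second × 60) / Rate limit per minute
Examples
Scenario 1: 1 request/second
(1 req/s × 60) / 5 RPM = 12 keys minimum
Scenario 2: 10 requests/second
(10 req/s × 60) / 5 RPM = 120 keys minimum
Scenario 3: Burst of 50 requests over 10 seconds
Average: 5 req/s × 60 = 300 req/min
300 / 5 RPM = 60 keys minimum if that rate is sustained
For a one-off burst that fits inside a single minute window, 50 / 5 RPM = 10 unused keys are enough.
Add a 20-30% buffer for safety and to account for retry delays. These figures consider RPM only; the daily limit (RPD) also caps how many requests each key can serve.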
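The formula can be written as two small functions, one for a sustained rate and one for a single burst. The function names are hypothetical and the buffer is expressed as a fraction (0.25 for 25%). Note the distinction the sketch makes: a sustained rate is bounded by requests per minute, while a one-off burst that fits inside one minute window is bounded only by its total request count.

```typescript
// Keys needed to sustain a steady request rate under a per-key RPM limit.
function keysForSustainedRate(
  requestsPerSecond: number,
  rpmPerKey: number,
  buffer = 0,
): number {
  const base = (requestsPerSecond * 60) / rpmPerKey;
  return Math.ceil(base * (1 + buffer));
}

// Keys needed for a one-off burst that fits within a single minute window,
// assuming none of the keys have been used in that window yet.
function keysForBurst(totalRequests: number, rpmPerKey: number): number {
  return Math.ceil(totalRequests / rpmPerKey);
}

const sustainedOne = keysForSustainedRate(1, 5); // 12
const sustainedTen = keysForSustainedRate(10, 5); // 120
const sustainedTenBuffered = keysForSustainedRate(10, 5, 0.25); // 150
const singleBurst = keysForBurst(50, 5); // 10
```

Neither function accounts for the per-day limit (RPD), which has to be checked separately.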
🔍 Monitoring & Debugging
Check Available Keys
import { SwarmKeyManager } from 'swarm-gem'
// After initialization, logs show:
// SwarmGem initialized: 5 keys, model: gemini-2.5-flash, available: 5/5
Console Output
The package automatically logs warnings for:
- Rate limits hit
- Authentication failures
- Key cooldowns
Example output:
Rate limit hit on key AIzaSyAbc1... (retry after 60s). Attempting next key...
API key AIzaSyDef2... marked as permanently failed. Key is invalid or lacks permissions.
SwarmGem initialized: 5 keys, model: gemini-2.5-flash, available: 5/5
🛡️ Best Practices
1. Key Management
- Rotate Keys: Regularly rotate your API keys for security
- Separate Projects: Use different key pools for different projects
- Monitor Usage: Track which keys are hitting limits frequently
- Invalid Keys: The package automatically detects and disables invalid keys
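One lightweight way to monitor which keys hit limits most often is to tally the package's warning lines. The sketch below assumes the warnings look exactly like the example output shown in this README ("Rate limit hit on key AIzaSyAbc1... (retry after 60s)"); the function name `tallyRateLimitHits` is hypothetical, and the regular expression would need adjusting if the real log format differs.

```typescript
// Count rate-limit warnings per key prefix from captured log lines.
// Assumption: lines follow the example output shown in this README.
function tallyRateLimitHits(lines: string[]): Map<string, number> {
  // Captures the truncated key that precedes the literal "...".
  const pattern = /Rate limit hit on key (\S+?)\.\.\./;
  const hits = new Map<string, number>();

  for (const line of lines) {
    const match = pattern.exec(line);
    if (match !== null) {
      const keyPrefix = match[1];
      hits.set(keyPrefix, (hits.get(keyPrefix) ?? 0) + 1);
    }
  }
  return hits;
}

const sampleHits = tallyRateLimitHits([
  "Rate limit hit on key AIzaSyAbc1... (retry after 60s). Attempting next key...",
  "API key AIzaSyDef2... marked as permanently failed. Key is invalid or lacks permissions.",
  "Rate limit hit on key AIzaSyAbc1... (retry after 60s). Attempting next key...",
]);
```

Keys that dominate the tally are candidates for replacement or for a closer look at how requests are distributed.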
2. Error Handling
Always wrap API calls in try-catch:
try {
  const response = await sg.generate(prompt)
  return response
} catch (error) {
  // Log error
  // Implement fallback logic
  // Alert monitoring system
}
3. Rate Limit Strategy
- Burst Traffic: Keep extra keys as buffer (20-30% more than calculated)
- Sustained Load: Consider upgrading to paid tier
- Off-Peak: Batch requests during off-peak hours when possible
4. Request Optimization
- Batch Prompts: Combine multiple questions into one request when possible
- Cache Responses: Cache common queries to reduce API calls
- Limit Tokens: Set a reasonable maxTokens to avoid hitting TPM limits
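Since the package does not cache responses itself, caching has to live in the application. The following is a minimal in-memory sketch, assuming a generate-style function that takes a prompt and returns a string; `withCache` and `PromptFn` are hypothetical names, and the cache key is the prompt text alone.

```typescript
type PromptFn = (prompt: string) => Promise<string>;

// Wrap any generate-style function with an in-memory cache keyed by prompt.
// Entries expire after ttlMs milliseconds.
function withCache(
  generate: PromptFn,
  ttlMs: number,
  now: () => number = Date.now,
): PromptFn {
  const cache = new Map<string, { value: string; expiresAt: number }>();

  return async (prompt: string): Promise<string> => {
    const hit = cache.get(prompt);
    if (hit !== undefined && hit.expiresAt > now()) {
      return hit.value; // Served from cache: no API call, no quota used.
    }
    const value = await generate(prompt);
    cache.set(prompt, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

With a SwarmGem instance this could be used as `const cached = withCache((p) => sg.generate(p), 5 * 60 * 1000)`. If different generation options are used for the same prompt, they would need to be part of the cache key as well.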
🧪 Testing
Run Tests
npm test
Example Test
import { describe, it, expect } from 'vitest'
import { parseRateLimit } from './src/utils.js'
describe('parseRateLimit', () => {
  it('should parse basic numbers', () => {
    expect(parseRateLimit('0 / 10')).toBe(10)
  })
  it('should parse K suffix', () => {
    expect(parseRateLimit('0 / 250K')).toBe(250000)
  })
  it('should parse M suffix', () => {
    expect(parseRateLimit('0 / 1M')).toBe(1000000)
  })
  it('should parse Unlimited', () => {
    expect(parseRateLimit('0 / Unlimited')).toBe(Infinity)
  })
})
📝 API Reference
SwarmGem.init(config?)
Initialize a new SwarmGem instance.
Parameters:
- config (optional): Configuration object
  - keys?: string[] - Array of API keys (overrides env)
  - model?: string - Model name (overrides env, defaults to 'gemini-2.5-flash')
  - rateLimitsPath?: string - Path to custom rate limits file
Returns: SwarmGem instance
Throws: Error if no keys provided or invalid configuration
sg.generate(prompt, options?)
Generate content using Gemini API with automatic key rotation.
Parameters:
- prompt: string - Text prompt to send to the model
- options?: GenerateOptions (optional)
  - temperature?: number - 0.0 to 1.0 (default: 0.7)
  - maxTokens?: number - Maximum tokens (default: 8192)
Returns: Promise<string> - Generated text response
Throws: Error if all keys exhausted or non-recoverable error
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Development Setup
# Clone the repository
git clone https://github.com/yourusername/swarm-gem.git
cd swarm-gem
# Install dependencies
npm install
# Build TypeScript
npm run build
# Run tests
npm test
📄 License
MIT License - see LICENSE file for details.
💡 Tips & Tricks
Reducing API Calls
- Cache Results: Store responses for identical prompts
- Debounce Requests: Delay rapid requests from user input
- Batch Processing: Process multiple items in fewer requests
Monitoring Costs
Free tier limits (per key per day):
- gemini-2.5-flash: 20 requests/day = 100 requests with 5 keys
- gemini-3-flash: 20 requests/day = 100 requests with 5 keys
Track daily usage to avoid hitting daily limits.
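The daily arithmetic is worth making explicit, because the per-day limit is usually reached long before the per-minute limit matters. The sketch below uses hypothetical function names and simply multiplies out the figures quoted above.

```typescript
// Total requests a key pool can serve per day under a per-key RPD limit.
function dailyCapacity(keyCount: number, rpdPerKey: number): number {
  return keyCount * rpdPerKey;
}

// Minutes until the pool's whole daily quota is used at a steady rate.
function minutesUntilDailyExhaustion(
  keyCount: number,
  rpdPerKey: number,
  requestsPerMinute: number,
): number {
  return dailyCapacity(keyCount, rpdPerKey) / requestsPerMinute;
}

const fiveKeyCapacity = dailyCapacity(5, 20); // 100 requests per day
// 12 keys at 1 req/s (60 req/min) last 240 / 60 = 4 minutes.
const twelveKeyMinutes = minutesUntilDailyExhaustion(12, 20, 60);
```

In other words, a pool sized only by the RPM formula can sustain its target rate for a few minutes per day on the free tier.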
Scaling Strategy
| Stage | Keys Needed | Cost | Throughput |
|-------|-------------|------|------------|
| Development | 2-3 | Free | < 1 req/s |
| Testing | 5-10 | Free | 1-2 req/s |
| Small Production | 20-50 | Free | 2-5 req/s |
| Medium Production | 100-200 | Free | 10-15 req/s |
| Large Production | Consider Paid Tier | $$$ | Unlimited |
❓ FAQ
Q: Why do I get "All API keys exhausted" error?
A: This means all your keys have hit their rate limits. Solutions:
- Wait for the minute/day window to reset
- Add more API keys
- Reduce request frequency
- Upgrade to paid tier
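The first option, waiting for the window to reset, can be automated with a small wrapper. This is a sketch under stated assumptions: it detects exhaustion by looking for "exhausted" in the error message, as the Error Handling example in this README does, and the names `generateWithWait` and `GenerateFn` are hypothetical.

```typescript
type GenerateFn = (prompt: string) => Promise<string>;

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry when the error message signals exhausted keys, waiting between
// attempts so the per-minute window has a chance to reset.
async function generateWithWait(
  generate: GenerateFn,
  prompt: string,
  maxAttempts = 3,
  waitMs = 60000,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<string> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await generate(prompt);
    } catch (error) {
      lastError = error;
      const message = error instanceof Error ? error.message : String(error);
      // Waiting does not help with anything other than exhaustion.
      if (!message.includes("exhausted")) {
        throw error;
      }
      if (attempt < maxAttempts) {
        await sleep(waitMs);
      }
    }
  }
  throw lastError;
}
```

A call might look like `await generateWithWait((p) => sg.generate(p), "Your prompt here")`. Waiting a minute only helps when the per-minute limit was the cause; if every key has used its daily quota, the retries will fail as well.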
Q: Can I mix free and paid tier keys?
A: Yes, but you'll need to update the rate limits config for paid keys to reflect their higher limits.
Q: What happens if one key is invalid?
A: The package automatically detects invalid keys (401/403 errors), marks them as permanently failed, and continues with remaining valid keys.
Q: Does this package cache responses?
A: No, SwarmGem focuses solely on key rotation and rate limiting. Implement caching in your application layer if needed.
Q: Can I use this with streaming responses?
A: Current version (v1.0.0) does not support streaming. This may be added in future versions.
Q: How accurate is the rate limit tracking?
A: Very accurate for request count limits (RPM, RPD). Token count limits (TPM) are not tracked, so you may still hit TPM limits if generating very long responses.
🐛 Troubleshooting
Issue: Module not found errors
# Make sure to build TypeScript first
npm run build
Issue: Environment variables not loading
# Install dotenv if needed
npm install dotenv
// Load in your app
import 'dotenv/config'
import SwarmGem from 'swarm-gem'
Issue: Rate limits still being hit
- Check if you're hitting token limits (TPM) instead of request limits
- Reduce maxTokens in generation options
- Add more API keys
- Verify your keys are valid
Made with ❤️ for the Gemini API community
