gemini-ai-toolkit v1.4.0

A comprehensive toolkit for the Google Gemini API, providing easy-to-use interfaces for text, chat, image, video, audio, and grounding features with all the latest Gemini models.
🤖 Gemini AI Toolkit
The most comprehensive, developer-friendly toolkit for Google's Gemini API
Features • Quick Start • Documentation • Examples • API Reference • Presets • Contributing
📋 Table of Contents
- Overview
- Features
- Installation
- Quick Start
- API Reference
- Presets
- Utilities
- Error Handling
- Type Definitions
- Examples
- Best Practices
- Performance Tips
- Requirements
- FAQ
- Contributing
- License
- Support
🎯 Overview
Gemini AI Toolkit is a production-ready, TypeScript-first npm package that provides a clean, intuitive interface to Google's powerful Gemini API. Built with developer experience in mind, it offers:
- 🚀 One-line functions for minimal code usage
- 🎨 79 preset configurations for common use cases
- 🛠️ Developer-friendly utilities for file operations
- 🤖 Smart helpers with auto-detection and auto-retry
- 📚 File Search (RAG) for querying your documents
- 📦 Zero dependencies (only `@google/genai` as a peer dependency)
- 🔒 Full TypeScript support with strict type checking
- ⚡ Auto API key detection from environment variables
- 🎯 Comprehensive error handling with helpful messages
- 📚 24 detailed examples covering all features
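The auto-retry behavior mentioned above is built into the toolkit's smart helpers. If you want the same pattern around your own calls, a minimal sketch looks like this (a sketch only; the retry counts and delay values are illustrative assumptions, not the package's internals):

```typescript
// Hypothetical retry wrapper with exponential backoff (not part of the toolkit's API).
async function withRetry<T>(
  fn: () => Promise<T>,
  { retries = 3, baseDelayMs = 500 } = {}
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Back off exponentially: 500ms, 1000ms, 2000ms, ...
      if (attempt < retries) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```

Wrap any quick function, e.g. `withRetry(() => generateText('Hello'))`, to ride out transient 429/503 responses.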
Why Gemini AI Toolkit?
| Feature | This Package | Others |
|---------|--------------|--------|
| Code Required | 1 line | 3-5 lines |
| Presets | 79 ready-to-use | Manual config |
| Type Safety | 100% TypeScript | Partial |
| Utilities | Built-in | External libs |
| Error Messages | Actionable tips | Generic |
| Documentation | Comprehensive | Basic |
| Examples | 24 examples | Few/None |
✨ Features
🎯 Core Capabilities
| Feature | Description | Models Supported |
|---------|-------------|------------------|
| 📝 Text Generation | Generate text with latest Gemini models | gemini-2.5-flash, gemini-2.5-pro |
| 💬 Chat Conversations | Create and manage chat sessions with context | All text models |
| 🖼️ Image Generation | Generate images with Imagen 4.0 | imagen-4.0-generate-001 |
| 🎨 Image Editing | Edit images with text prompts | Imagen models |
| 🔍 Image Understanding | Analyze and understand image content | gemini-2.5-flash-image |
| 🎬 Video Generation | Generate videos from images with Veo 3.1 | veo-3.1-fast-generate-preview |
| 📹 Video Understanding | Analyze video content frame-by-frame | All vision models |
| 🔊 Text-to-Speech | Convert text to natural speech | gemini-2.5-flash-preview-tts |
| 🎤 Live Conversations | Real-time audio conversations | gemini-2.5-flash-native-audio-preview-09-2025 |
| 🌐 Grounded Search | Get up-to-date answers from Google Search | All text models |
| 🗺️ Grounded Maps | Find location-based information | All text models |
| 📚 File Search (RAG) | Query your documents with Retrieval Augmented Generation | gemini-2.5-flash, gemini-2.5-pro |
| 🔗 URL Context | Analyze content from web pages, PDFs, and URLs | gemini-2.5-flash, gemini-2.5-pro |
| 🧠 Thinking Mode | Tackle complex problems with extended thinking | gemini-2.5-pro |
| 📁 Files API | Upload, manage, and use media files (images, videos, audio, documents) | All multimodal models |
| 💾 Context Caching | Cache content to reduce costs on repeated requests | gemini-2.0-flash-001, gemini-2.5-flash, gemini-2.5-pro |
| 🔢 Token Counting | Count tokens for any content before sending to API | All models |
| 🎵 Lyria RealTime | Real-time streaming music generation with interactive control | models/lyria-realtime-exp (experimental) |
🎁 Developer Experience Features
- ⚡ Quick Functions: One-liner functions for common operations
- 🎨 79 Presets: Pre-configured options for all use cases
- 🛠️ Utilities: File operations, batch processing, streaming helpers
- 🔄 Auto-initialization: Automatic API key detection
- 📦 Minimal Dependencies: Only 1 production dependency
- 🎯 Type Safety: Full TypeScript support with strict mode
- 📚 Comprehensive Docs: Detailed documentation and examples
📦 Installation
Prerequisites
- Node.js >= 18.0.0
- npm, yarn, or pnpm
- Google Gemini API Key (Get one here)
Install
# Using npm
npm install gemini-ai-toolkit
# Using yarn
yarn add gemini-ai-toolkit
# Using pnpm
pnpm add gemini-ai-toolkit

Get Your API Key
- Visit Google AI Studio
- Sign in with your Google account
- Create a new API key
- Copy and store it securely
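Once the key is stored (for example in `GEMINI_API_KEY`), a small guard like the following fails fast with an actionable message when it is missing. This helper is a sketch, not part of the toolkit:

```typescript
// Hypothetical helper: resolve the API key from the environment, or fail with a clear message.
function requireApiKey(env: Record<string, string | undefined> = process.env): string {
  const key = env.GEMINI_API_KEY;
  if (!key) {
    throw new Error(
      'GEMINI_API_KEY is not set. Create a key in Google AI Studio and export it before running.'
    );
  }
  return key;
}
```

Call it once at startup so configuration mistakes surface immediately rather than as opaque API errors later.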
🚀 Quick Start
Option 1: Service-Based Architecture (Recommended) 🏗️
Perfect for applications requiring maintainable, modular code
import { GeminiToolkit } from 'gemini-ai-toolkit';
// Initialize toolkit once
const toolkit = new GeminiToolkit({
apiKey: 'your-api-key-here' // or set GEMINI_API_KEY env var
});
// Access service instances directly
const { coreAI, chat, grounding, fileSearch, files, cache, tokens } = toolkit;
// Core AI operations
const text = await coreAI.generateText('Explain quantum computing in simple terms');
const image = await coreAI.generateImage('A futuristic robot in a cyberpunk city');
// Chat conversations
const chatSession = chat.createChat('gemini-2.5-pro');
const response = await chatSession.sendMessage({ message: 'Hello!' });
// Grounded search
const searchResults = await grounding.groundWithSearch('Latest AI developments in 2024');
// File Search (RAG) - query your documents
const store = await fileSearch.createFileSearchStore('my-documents');
const operation = await fileSearch.uploadToFileSearchStore('document.pdf', store.name);
// Wait for operation.done, then query:
const answer = await fileSearch.queryWithFileSearch('Tell me about X', {
fileSearchStoreNames: [store.name]
});
// Files API - upload and use files
const file = await files.uploadFile('image.jpg', { displayName: 'My Image' });
const analysis = await coreAI.generateText('Describe this image', { files: [file] });
// Context Caching - reduce costs on repeated requests
const cacheObj = await cache.createCache('gemini-2.0-flash-001', {
systemInstruction: 'You are a helpful assistant.',
contents: [file],
ttl: '300s'
});
const cachedResult = await coreAI.generateText('What is this?', { cachedContent: cacheObj.name });
// Token Counting - estimate costs
const tokenCount = await tokens.countTokens('Hello, world!');
console.log(`Tokens: ${tokenCount.totalTokens}`);
// Live conversations with ephemeral tokens
const token = await chat.createEphemeralToken({
uses: 1,
expireTime: new Date(Date.now() + 30 * 60 * 1000) // 30 minutes
});
const liveSession = await chat.connectLive({
onmessage: async (message) => console.log('Received:', message),
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Session closed')
}, {}, token.name);
// Lyria RealTime music generation (experimental, requires v1alpha)
const musicSession = await chat.connectMusic({
onmessage: async (message) => {
if (message.serverContent?.audioChunks) {
// Process audio chunks (16-bit PCM, 48kHz, stereo)
}
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Session closed')
});
await musicSession.setWeightedPrompts({
weightedPrompts: [{ text: 'minimal techno', weight: 1.0 }]
});
await musicSession.setMusicGenerationConfig({
musicGenerationConfig: { bpm: 90, temperature: 1.0 }
});
await musicSession.play();

Option 2: One-Line Functions ⚡
Perfect for quick scripts and minimal code usage
import { generateText, generateImage, search, createFileSearchStore, uploadToFileSearchStore, queryFileSearch, uploadFile, createCache, countTokens, connectMusic } from 'gemini-ai-toolkit';
// Set GEMINI_API_KEY environment variable
// export GEMINI_API_KEY="your-api-key-here"
// One line - that's it!
const text = await generateText('Explain quantum computing in simple terms');
const image = await generateImage('A futuristic robot in a cyberpunk city');
const results = await search('Latest AI developments in 2024');
// File Search (RAG) - query your documents
const store = await createFileSearchStore('my-documents');
const operation = await uploadToFileSearchStore('document.pdf', store.name);
// Wait for operation.done, then query:
const answer = await queryFileSearch('Tell me about X', {
fileSearchStoreNames: [store.name]
});
// Files API - upload and use files
const file = await uploadFile('image.jpg', { displayName: 'My Image' });
const result = await generateText('Describe this image', { files: [file] });
// Context Caching - reduce costs on repeated requests
const cache = await createCache('gemini-2.0-flash-001', {
systemInstruction: 'You are a helpful assistant.',
contents: [file],
ttl: '300s'
});
const cachedResult = await generateText('What is this?', { cachedContent: cache.name });
// Token Counting - estimate costs
const tokenCount = await countTokens('Hello, world!');
console.log(`Tokens: ${tokenCount.totalTokens}`);
// Lyria RealTime - generate music (experimental, requires v1alpha)
const musicSession = await connectMusic({
onmessage: async (message) => {
if (message.serverContent?.audioChunks) {
// Process audio chunks (16-bit PCM, 48kHz, stereo)
}
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Session closed')
});
await musicSession.setWeightedPrompts({
weightedPrompts: [{ text: 'minimal techno', weight: 1.0 }]
});
await musicSession.setMusicGenerationConfig({
musicGenerationConfig: { bpm: 90, temperature: 1.0 }
});
await musicSession.play();

Option 3: Initialize Once, Use Everywhere
Best for applications with multiple API calls
import { init, generateText, generateImage } from 'gemini-ai-toolkit';
// Initialize once at app startup
init('your-api-key-here');
// Now use anywhere - no API key needed!
const text1 = await generateText('First prompt');
const text2 = await generateText('Second prompt');
const image = await generateImage('A robot');

Option 4: Full Class API (Maximum Control)
Best for complex applications needing fine-grained control
import { GeminiToolkit } from 'gemini-ai-toolkit';
const toolkit = new GeminiToolkit({
apiKey: 'your-api-key-here'
});
const text = await toolkit.generateText('Hello, world!');
const chat = toolkit.createChat('gemini-2.5-pro');

Environment Variable Setup
# Linux/macOS
export GEMINI_API_KEY="your-api-key-here"
# Windows (PowerShell)
$env:GEMINI_API_KEY="your-api-key-here"
# Windows (CMD)
set GEMINI_API_KEY=your-api-key-here
# Or use a .env file (recommended)
echo "GEMINI_API_KEY=your-api-key-here" > .env

💡 Examples
The toolkit includes comprehensive examples demonstrating different usage patterns. Run them with:
# List all available examples
npm run examples
# Run the basic service-based example (recommended)
npm run example:basic
# Run the advanced patterns example
npm run example:advanced
# Run the migration guide
npm run example:migration

Example Categories
- `examples/service-based-example.ts` - New modular architecture (recommended)
- `examples/advanced-service-example.ts` - Advanced patterns and real-world usage
- `examples/migration-example.ts` - Migrating from monolithic to service-based
See examples/README.md for detailed documentation of all examples.
📚 API Reference
Quick Functions (One-Liners)
All quick functions automatically detect your API key from the GEMINI_API_KEY environment variable, or use the instance cached by init().
generateText(prompt, options?, apiKey?)
Generate text content with a single line of code.
import { generateText, presets } from 'gemini-ai-toolkit';
// Basic usage
const text = await generateText('What is artificial intelligence?');
// With options
const text = await generateText('Explain quantum computing', {
model: 'gemini-2.5-pro',
config: { temperature: 0.7, maxOutputTokens: 2000 }
});
// With preset
const text = await generateText('Quick answer', presets.text.fast);
// With explicit API key
const text = await generateText('Hello!', undefined, 'your-api-key');

Parameters:
- `prompt` (string, required): The text prompt
- `options` (GenerateTextOptions, optional): Configuration options
- `apiKey` (string, optional): API key (overrides env var)
Returns: Promise<string> - Generated text
generateImage(prompt, options?, apiKey?)
Generate images with Imagen 4.0.
import { generateImage, presets, saveImage } from 'gemini-ai-toolkit';
// Basic usage
const imageBase64 = await generateImage('A robot with a skateboard');
// With preset
const imageBase64 = await generateImage('A landscape', presets.image.wide);
// Save to file
saveImage(imageBase64, 'output.png');

Parameters:
- `prompt` (string, required): Image description
- `options` (GenerateImageOptions, optional): Configuration options
- `apiKey` (string, optional): API key
Returns: Promise<string> - Base64 encoded image
createChat(model?, apiKey?)
Create a chat session for conversational interactions.
import { createChat, presets } from 'gemini-ai-toolkit';
// Basic usage
const chat = createChat();
// With model
const chat = createChat('gemini-2.5-pro');
// With preset
const chat = createChat(presets.chat.professional);
// Use the chat
const response = await chat.sendMessage({ message: 'Hello!' });
console.log(response.text);
// Streaming
const stream = await chat.sendMessageStream({ message: 'Tell a story' });
for await (const chunk of stream) {
process.stdout.write(chunk.text);
}

Returns: Chat instance
generateSpeech(text, options?, apiKey?)
Convert text to natural speech.
import { generateSpeech, saveAudio, presets } from 'gemini-ai-toolkit';
// Basic usage
const audioBase64 = await generateSpeech('Hello, world!');
// With preset
const audioBase64 = await generateSpeech('Welcome!', presets.speech.narration);
// Save to file
saveAudio(audioBase64, 'output.wav');

Returns: Promise<string> - Base64 encoded audio
search(query, apiKey?)
Search the web and get grounded answers.
import { search } from 'gemini-ai-toolkit';
const result = await search('Latest AI developments in 2024');
console.log(result.text);

Returns: Promise<GroundedResult> - Search results with citations
findNearby(query, location, apiKey?)
Find nearby places using Google Maps.
import { findNearby } from 'gemini-ai-toolkit';
const places = await findNearby('restaurants', {
latitude: 37.7749,
longitude: -122.4194
});
console.log(places.text);

Returns: Promise<GroundedResult> - Location-based results
analyzeImage(imageBase64, prompt, mimeType, options?, apiKey?)
Analyze image content.
import { analyzeImage, loadImage, presets } from 'gemini-ai-toolkit';
const imageBase64 = await loadImage('photo.jpg');
const analysis = await analyzeImage(
imageBase64,
'What is in this image?',
'image/jpeg',
presets.analysis.detailed
);

Returns: Promise<string> - Analysis text
editImage(imageBase64, mimeType, prompt, apiKey?)
Edit images with text prompts.
import { editImage, loadImage, saveImage } from 'gemini-ai-toolkit';
const imageBase64 = await loadImage('input.png');
const edited = await editImage(
imageBase64,
'image/png',
'Add a sunset in the background'
);
saveImage(edited, 'output.png');

Returns: Promise<string> - Base64 encoded edited image
init(apiKey)
Initialize the toolkit once for use with quick functions.
import { init, generateText } from 'gemini-ai-toolkit';
// Initialize once
init('your-api-key-here');
// Now all quick functions work without passing API key
const text = await generateText('Hello!');

getToolkit()
Get the default toolkit instance.
import { getToolkit } from 'gemini-ai-toolkit';
const toolkit = getToolkit();
// Use toolkit methods directly

queryFileSearch(prompt, config, model?, apiKey?)
Query your documents with File Search (RAG) for accurate, context-aware answers.
import { queryFileSearch, createFileSearchStore, uploadToFileSearchStore } from 'gemini-ai-toolkit';
// Create a File Search store
const store = await createFileSearchStore('my-documents');
// Upload a file (wait for operation to complete)
const operation = await uploadToFileSearchStore('document.pdf', store.name);
// Poll operation.done until true...
// Query your documents
const result = await queryFileSearch('Tell me about Robert Graves', {
fileSearchStoreNames: [store.name]
});
console.log(result.text);

Parameters:
- `prompt` (string, required): The query or prompt
- `config` (FileSearchQueryConfig, required): File Search configuration
  - `fileSearchStoreNames` (string[], required): Array of File Search store names
  - `metadataFilter` (string, optional): Metadata filter (e.g., 'author="Robert Graves"')
- `model` (string, optional): Model name (default: 'gemini-2.5-flash')
- `apiKey` (string, optional): API key
Returns: Promise<GroundedResult> - Query results with citations
createFileSearchStore(displayName?, apiKey?)
Create a new File Search store for RAG.
import { createFileSearchStore } from 'gemini-ai-toolkit';
const store = await createFileSearchStore('my-documents');
console.log(store.name); // Use this name for uploads and queries

Returns: Promise<FileSearchStore> - Created File Search store
createEphemeralToken(config?, apiKey?)
Create ephemeral token for secure Live API access (server-side only).
import { createEphemeralToken } from 'gemini-ai-toolkit';
// Server-side: Create token
const token = await createEphemeralToken({
uses: 1,
expireTime: new Date(Date.now() + 30 * 60 * 1000), // 30 minutes
newSessionExpireTime: new Date(Date.now() + 60 * 1000), // 1 minute
liveConnectConstraints: {
model: 'gemini-2.5-flash-native-audio-preview-09-2025',
config: {
temperature: 0.7,
responseModalities: ['AUDIO']
}
}
});
// Send token.name to client for use with connectLive()

Parameters:
- `config` (EphemeralTokenConfig, optional): Token configuration
  - `uses` (number, optional): Number of uses (default: 1)
  - `expireTime` (Date | string, optional): Expiration (default: 30 minutes)
  - `newSessionExpireTime` (Date | string, optional): New session expiration (default: 1 minute)
  - `liveConnectConstraints` (object, optional): Lock the token to a specific config
- `apiKey` (string, optional): API key
Returns: Promise<EphemeralToken> - Token with name property (use as API key)
Note: ⚠️ Server-side only. Ephemeral tokens enhance security for client-side Live API access.
Files API Quick Functions
uploadFile(filePath, config?, apiKey?)
Quick file upload - minimal code!
import { uploadFile } from 'gemini-ai-toolkit';
const file = await uploadFile('document.pdf', { displayName: 'My Document' });

Returns: Promise<FileObject>
getFile(fileName, apiKey?)
Quick file metadata retrieval - minimal code!
import { getFile } from 'gemini-ai-toolkit';
const metadata = await getFile('files/my-file-123');

Returns: Promise<FileObject>
listFiles(pageSize?, apiKey?)
Quick file listing - minimal code!
import { listFiles } from 'gemini-ai-toolkit';
const files = await listFiles(10);
for await (const file of files) {
console.log(file.name);
}

Returns: Promise<Iterable<FileObject>>
deleteFile(fileName, apiKey?)
Quick file deletion - minimal code!
import { deleteFile } from 'gemini-ai-toolkit';
await deleteFile('files/my-file-123');

Returns: Promise<void>
Context Caching Quick Functions
createCache(model, config, apiKey?)
Quick cache creation - minimal code!
import { createCache, uploadFile } from 'gemini-ai-toolkit';
const file = await uploadFile('video.mp4');
const cache = await createCache('gemini-2.0-flash-001', {
displayName: 'my-cache',
contents: [file],
ttl: '300s' // 5 minutes
});

Returns: Promise<CachedContent>
listCaches(apiKey?)
Quick cache listing - minimal code!
import { listCaches } from 'gemini-ai-toolkit';
const caches = await listCaches();
for await (const cache of caches) {
console.log(cache.name);
}

Returns: Promise<Iterable<CachedContent>>
getCache(cacheName, apiKey?)
Quick cache retrieval - minimal code!
import { getCache } from 'gemini-ai-toolkit';
const cache = await getCache('cachedContents/my-cache-123');

Returns: Promise<CachedContent>
updateCache(cacheName, config, apiKey?)
Quick cache update - minimal code!
import { updateCache } from 'gemini-ai-toolkit';
await updateCache('cachedContents/my-cache-123', { ttl: '600s' });

Returns: Promise<CachedContent>
deleteCache(cacheName, apiKey?)
Quick cache deletion - minimal code!
import { deleteCache } from 'gemini-ai-toolkit';
await deleteCache('cachedContents/my-cache-123');

Returns: Promise<void>
Token Counting Quick Functions
countTokens(contents, model?, apiKey?)
Quick token counting - minimal code!
import { countTokens } from 'gemini-ai-toolkit';
const count = await countTokens('Hello, world!');
console.log(count.totalTokens);

Returns: Promise<TokenCount>
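Token counts are mainly useful for pre-flight cost estimates. The sketch below shows the arithmetic; the per-million-token prices are placeholder assumptions, so consult current Gemini pricing before relying on them:

```typescript
// Hypothetical price table (USD per 1M input tokens). Illustrative values only;
// real prices vary by model, tier, and date.
const PRICE_PER_MILLION_INPUT: Record<string, number> = {
  'gemini-2.5-flash': 0.30,
  'gemini-2.5-pro': 1.25,
};

// Estimate input cost from a countTokens() result.
function estimateInputCost(totalTokens: number, model: string): number {
  const price = PRICE_PER_MILLION_INPUT[model];
  if (price === undefined) throw new Error(`No price configured for ${model}`);
  return (totalTokens / 1_000_000) * price;
}
```

Feed `tokenCount.totalTokens` from `countTokens()` into `estimateInputCost()` before sending large prompts or cached contexts.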
connectMusic(callbacks, apiKey?)
Quick music session connection - minimal code!
import { connectMusic } from 'gemini-ai-toolkit';
const session = await connectMusic({
onmessage: async (message) => {
// Handle audio chunks
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Closed')
});

Returns: Promise<MusicSession> - Music session object
Note: ⚠️ Experimental model, requires v1alpha API.
uploadToFileSearchStore(filePath, fileSearchStoreName, config?, apiKey?)
Upload a file directly to a File Search store (combines upload and import).
import { uploadToFileSearchStore } from 'gemini-ai-toolkit';
let operation = await uploadToFileSearchStore(
'document.pdf',
store.name,
{
displayName: 'My Document',
customMetadata: [
{ key: 'author', stringValue: 'Robert Graves' },
{ key: 'year', numericValue: 1934 }
],
chunkingConfig: {
whiteSpaceConfig: {
maxTokensPerChunk: 200,
maxOverlapTokens: 20
}
}
}
);
// Poll operation.done until true
while (!operation.done) {
await new Promise(resolve => setTimeout(resolve, 5000));
operation = await getClient().operations.get({ operation });
}

Parameters:
- `filePath` (string, required): Path to the file to upload
- `fileSearchStoreName` (string, required): Name of the File Search store
- `config` (FileSearchUploadConfig, optional): Upload configuration
  - `displayName` (string, optional): Display name for the file
  - `customMetadata` (FileMetadata[], optional): Custom metadata
  - `chunkingConfig` (ChunkingConfig, optional): Chunking configuration
- `apiKey` (string, optional): API key
Returns: Promise<Operation> - Operation that can be polled for completion
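The returned operation must be polled until it reports `done`, as the loop above shows. That pattern generalizes into a small helper; this is a sketch where the `getOperation` callback stands in for whatever refresh call your client exposes, and the interval and attempt limits are assumptions:

```typescript
// Hypothetical polling helper: re-fetch an operation until it completes or we give up.
async function pollUntilDone<T extends { done?: boolean }>(
  initial: T,
  getOperation: (op: T) => Promise<T>,
  { intervalMs = 5000, maxAttempts = 60 } = {}
): Promise<T> {
  let op = initial;
  for (let attempt = 0; attempt < maxAttempts && !op.done; attempt++) {
    // Wait between polls to avoid hammering the API.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    op = await getOperation(op);
  }
  if (!op.done) throw new Error(`Operation did not finish after ${maxAttempts} polls`);
  return op;
}
```

Usage would look like `await pollUntilDone(operation, (op) => getClient().operations.get({ operation: op }))`, after which the File Search store is ready to query.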
queryWithUrlContext(prompt, model?, apiKey?)
Query content from URLs using the URL Context tool. URLs should be included in the prompt text.
import { queryWithUrlContext } from 'gemini-ai-toolkit';
const result = await queryWithUrlContext(
'Compare the ingredients from https://example.com/recipe1 and https://example.com/recipe2'
);
console.log(result.text);
// Access URL retrieval metadata
const urlMetadata = result.candidates?.[0]?.urlContextMetadata;
console.log(urlMetadata);

Parameters:
- `prompt` (string, required): The prompt containing URLs to analyze
- `model` (string, optional): Model name (default: 'gemini-2.5-flash')
- `apiKey` (string, optional): API key
Returns: Promise<GroundedResult> - Query results with URL metadata
Note: Up to 20 URLs can be processed per request. Maximum 34MB per URL.
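Given the 20-URL cap, it can help to validate a prompt before sending it. A rough pre-flight sketch (the regex is a simplification for illustration, not the API's own URL parser):

```typescript
// Hypothetical pre-flight check: count URLs in a prompt against the documented limit.
const MAX_URLS_PER_REQUEST = 20;

function extractUrls(prompt: string): string[] {
  // Simplified matcher: http(s) runs up to whitespace or a closing paren.
  return prompt.match(/https?:\/\/[^\s)]+/g) ?? [];
}

function assertUrlLimit(prompt: string): string[] {
  const urls = extractUrls(prompt);
  if (urls.length > MAX_URLS_PER_REQUEST) {
    throw new Error(
      `Prompt contains ${urls.length} URLs; the URL Context tool accepts at most ${MAX_URLS_PER_REQUEST}`
    );
  }
  return urls;
}
```

Running `assertUrlLimit(prompt)` before `queryWithUrlContext(prompt)` turns a silent truncation risk into an explicit error.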
queryWithUrlContextAndSearch(prompt, model?, apiKey?)
Query with both URL Context and Google Search tools enabled.
import { queryWithUrlContextAndSearch } from 'gemini-ai-toolkit';
const result = await queryWithUrlContextAndSearch(
'Find AI trends and analyze https://example.com/ai-report'
);

Returns: Promise<GroundedResult> - Combined search and URL analysis results
Class API (GeminiToolkit)
For applications needing more control, use the class API:
Service-Based Architecture (Recommended)
The toolkit uses a modular service-based architecture for better maintainability and separation of concerns. Each service handles a specific domain of functionality.
import { GeminiToolkit } from 'gemini-ai-toolkit';
// Initialize toolkit once
const toolkit = new GeminiToolkit({
apiKey: 'your-api-key-here'
});
// Access service instances
const { coreAI, chat, grounding, fileSearch, files, cache, tokens } = toolkit;

Constructor

new GeminiToolkit(config: GeminiToolkitConfig)

Config:
- `apiKey` (string, required): Your Gemini API key
Service Properties
The GeminiToolkit class provides the following service instances:
- `coreAI: CoreAIService` - Text, image, video, speech generation
- `chat: ChatService` - Chat conversations, live sessions, ephemeral tokens
- `grounding: GroundingService` - Google Search, Maps, URL context
- `fileSearch: FileSearchService` - File Search (RAG) operations
- `files: FilesService` - File upload/management operations
- `cache: CacheService` - Context caching operations
- `tokens: TokenService` - Token counting operations
CoreAIService
Handles core AI generation operations including text, images, videos, and speech.
// Text generation
const text = await coreAI.generateText('Explain quantum computing', {
model: 'gemini-2.5-pro',
config: { temperature: 0.7 }
});
// Image generation
const imageB64 = await coreAI.generateImage('A futuristic robot', {
aspectRatio: '16:9',
personGeneration: 'allow_adult'
});
// Video generation (from image)
const videoResult = await coreAI.generateVideo(imageB64, 'image/jpeg', 'Make it dance', {
durationSeconds: 4,
fps: 30
});
// Image editing
const editedImage = await coreAI.editImage(existingImageB64, 'image/jpeg', 'Add a hat');
// Media analysis
const analysis = await coreAI.analyzeMedia(imageB64, 'image/jpeg', 'What do you see?');
// Speech synthesis
const audioB64 = await coreAI.generateSpeech('Hello, world!', {
voiceName: 'Puck',
languageCode: 'en-US'
});

Methods:
- `generateText(prompt, options?)` - Generate text content
- `generateImage(prompt, options?)` - Generate images
- `editImage(imageB64, mimeType, prompt, model?)` - Edit existing images
- `analyzeMedia(data, mimeType, prompt, options?)` - Analyze images/videos/audio
- `generateVideo(imageB64, mimeType, prompt, options?)` - Generate videos from images
- `generateSpeech(text, options?)` - Generate speech audio
ChatService
Manages chat conversations, live sessions, and ephemeral tokens.
// Create chat sessions
const chat = chatService.createChat('gemini-2.5-pro');
const response = await chat.sendMessage({ message: 'Hello!' });
// Ephemeral tokens for live sessions
const token = await chatService.createEphemeralToken({
uses: 1,
expireTime: new Date(Date.now() + 30 * 60 * 1000) // 30 minutes
});
// Live conversation sessions
const liveSession = await chatService.connectLive({
onmessage: async (message) => console.log('Received:', message),
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Session closed')
}, {
model: 'gemini-2.0-flash-exp',
responseModalities: ['text']
}, token.name);
// Music generation (experimental)
const musicSession = await chatService.connectMusic({
onmessage: async (message) => {
if (message.serverContent?.audioChunks) {
// Process 16-bit PCM audio chunks at 48kHz stereo
}
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Session closed')
});

Methods:
- `createChat(model?)` - Create a chat session
- `createEphemeralToken(config?)` - Create ephemeral tokens for live sessions
- `connectLive(callbacks, options?, ephemeralToken?)` - Start a live conversation
- `connectMusic(callbacks, apiKey?)` - Start a music generation session
GroundingService
Provides grounding capabilities with Google Search, Maps, and URL context.
// Ground with Google Search
const searchResult = await grounding.groundWithSearch(
'Latest developments in quantum computing',
'gemini-2.5-pro'
);
console.log(searchResult.text); // Grounded response
console.log(searchResult.candidates[0].citationMetadata?.citations); // Citations
// Ground with Google Maps
const mapsResult = await grounding.groundWithMaps(
'Find Italian restaurants near Central Park',
{ latitude: 40.7829, longitude: -73.9654 },
'gemini-2.5-pro'
);
// Generate with URL context
const urlResult = await grounding.generateWithUrlContext(
'Summarize the main points from this article',
'gemini-2.5-pro'
);
// Combine URL context with search
const combinedResult = await grounding.generateWithUrlContextAndSearch(
'Compare the information from the URL with current developments',
'gemini-2.5-pro'
);

Methods:
- `groundWithSearch(prompt, model?)` - Generate with Google Search grounding
- `groundWithMaps(prompt, location, model?)` - Generate with Google Maps grounding
- `generateWithUrlContext(prompt, model?)` - Generate with URL context
- `generateWithUrlContextAndSearch(prompt, model?)` - Generate with URL context + search
FileSearchService
Manages File Search (Retrieval Augmented Generation) operations.
// Create a file search store
const store = await fileSearch.createFileSearchStore('my-documents');
console.log(`Store created: ${store.name}`);
// Upload files to the store
const operation = await fileSearch.uploadToFileSearchStore(
'document.pdf',
store.name,
{ mimeType: 'application/pdf' }
);
// Wait for processing to complete
// ... (polling logic)
// Query the store
const answer = await fileSearch.queryWithFileSearch(
'What are the key findings?',
{
fileSearchStoreNames: [store.name],
maxNumResults: 5,
resultThreshold: 0.7
},
'gemini-2.5-pro'
);
// Import existing files
await fileSearch.importFileToFileSearchStore(
store.name,
'files/document.pdf',
{ mimeType: 'application/pdf' }
);
// List and manage stores
const stores = await fileSearch.listFileSearchStores();
const storeInfo = await fileSearch.getFileSearchStore(store.name);
await fileSearch.deleteFileSearchStore(store.name);

Methods:
- `createFileSearchStore(displayName?)` - Create a new file search store
- `listFileSearchStores()` - List all file search stores
- `getFileSearchStore(name)` - Get store details
- `deleteFileSearchStore(name, force?)` - Delete a store
- `uploadToFileSearchStore(file, storeName, config?, apiKey?)` - Upload a file to a store
- `importFileToFileSearchStore(storeName, fileName, config?)` - Import an existing file
- `queryWithFileSearch(prompt, config, model?)` - Query files with RAG
FilesService
Handles file upload, retrieval, listing, and deletion operations.
// Upload files
const file = await files.uploadFile('image.jpg', {
displayName: 'My Image',
mimeType: 'image/jpeg'
});
console.log(`Uploaded: ${file.name}`);
// Get file information
const fileInfo = await files.getFile(file.name);
console.log(`State: ${fileInfo.state}, Size: ${fileInfo.sizeBytes} bytes`);
// List files
const allFiles = await files.listFiles(10); // max 10 results
allFiles.files.forEach(f => console.log(`${f.name}: ${f.displayName}`));
// Delete files
await files.deleteFile(file.name);

Methods:
- `uploadFile(filePath, config?)` - Upload a file
- `getFile(fileName)` - Get file information
- `listFiles(pageSize?)` - List uploaded files
- `deleteFile(fileName)` - Delete a file
CacheService
Manages context caching for cost reduction on repeated requests.
// Create a cache (named `cached` so it doesn't shadow the `cache` service)
const cached = await cache.createCache('gemini-2.0-flash-001', {
systemInstruction: 'You are a helpful assistant specializing in JavaScript.',
contents: [
{
role: 'user',
parts: [{ text: 'Explain closures in JavaScript.' }]
},
{
role: 'model',
parts: [{ text: 'Closures are...' }]
}
],
ttl: '3600s' // 1 hour
});
// Use cached content
const response = await coreAI.generateText(
'Give me an example of a closure',
{ cachedContent: cached.name }
);
// List and manage caches
const caches = await cache.listCaches();
const cacheInfo = await cache.getCache(cached.name);
await cache.updateCache(cached.name, { ttl: '7200s' }); // Extend TTL
await cache.deleteCache(cached.name);

Methods:
- `createCache(model, config)` - Create a new cache
- `listCaches()` - List all caches
- `getCache(cacheName)` - Get cache details
- `updateCache(cacheName, config)` - Update cache settings
- `deleteCache(cacheName)` - Delete a cache
TokenService
Provides token counting for cost estimation.
// Count tokens in text
const count = await tokens.countTokens('Hello, world!');
console.log(`Total tokens: ${count.totalTokens}`);
// Count tokens with model context
const countWithModel = await tokens.countTokens(
'Explain quantum computing',
'gemini-2.5-pro'
);
// Count tokens for multimodal content
const multimodalCount = await tokens.countTokens([
{ text: 'Describe this image:' },
{ inlineData: { mimeType: 'image/jpeg', data: imageBase64 } }
], 'gemini-2.5-pro');

Methods:
- `countTokens(contents, model?)` - Count tokens in content
Legacy Direct Methods (Deprecated)
For backward compatibility, the GeminiToolkit class still provides direct methods, but these are deprecated. Use the service instances instead.
Deprecated Methods
generateText(prompt, options?)
Generate text content.
const text = await toolkit.generateText('Hello, world!', {
model: 'gemini-2.5-pro',
config: {
temperature: 0.7,
maxOutputTokens: 2000,
topP: 0.95,
topK: 40
}
});

Options:
- `model` (string): Model name (default: `'gemini-2.5-flash'`)
- `config` (object): Additional model configuration
createChat(model?)
Create a chat session.
const chat = toolkit.createChat('gemini-2.5-pro');
// Send message
const response = await chat.sendMessage({
message: 'Hello!'
});
// Streaming
const stream = await chat.sendMessageStream({
message: 'Tell a story'
});
for await (const chunk of stream) {
console.log(chunk.text);
}

Chat Methods:
- `sendMessage({ message })` - Send a message and get a response
- `sendMessageStream({ message })` - Stream response chunks
generateImage(prompt, options?)
Generate images.
const imageBase64 = await toolkit.generateImage(
'A futuristic city at sunset',
{
aspectRatio: '16:9',
outputMimeType: 'image/png',
numberOfImages: 1
}
);

Options:
- `model` (string): Model name (default: `'imagen-4.0-generate-001'`)
- `aspectRatio` (ImageAspectRatio): `'1:1'`, `'16:9'`, `'9:16'`, `'4:3'`, `'3:4'`
- `numberOfImages` (number): 1-4 (default: 1)
- `outputMimeType` (string): `'image/png'`, `'image/jpeg'`, `'image/webp'`
editImage(imageBase64, mimeType, prompt, model?)
Edit images.
const edited = await toolkit.editImage(
imageBase64,
'image/png',
'Apply a retro 80s filter with warm tones'
);

analyzeMedia(data, mimeType, prompt, options?)
Analyze images, video frames, or audio.
// Single image
const analysis = await toolkit.analyzeMedia(
imageBase64,
'image/png',
'What is in this image?'
);
// Multiple frames (video)
const frames = [frame1, frame2, frame3];
const analysis = await toolkit.analyzeMedia(
frames,
'image/jpeg',
'Describe the video content',
{ isVideo: true }
);

Options:
- `model` (string): Model name
- `isVideo` (boolean): Set to `true` for video analysis
generateVideo(imageBase64, mimeType, prompt, options?)
Generate videos from images.
const operation = await toolkit.generateVideo(
imageBase64,
'image/png',
'Make the scene come alive with gentle movement',
{
aspectRatio: '16:9',
resolution: '1080p'
}
);
// Poll for completion
// Note: Video generation is asynchronous

Options:
- `model` (string): Model name (default: `'veo-3.1-fast-generate-preview'`)
- `aspectRatio` (VideoAspectRatio): `'16:9'` or `'9:16'`
- `resolution` (string): `'720p'` or `'1080p'`
- `numberOfVideos` (number): 1
Returns: Operation object (poll for completion)
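Since the returned operation completes asynchronously, you need a polling loop around whatever call re-fetches its status. The helper below is an illustrative sketch, not part of the toolkit API; the `refresh` callback and the `Operation` shape are simplified assumptions standing in for however you re-query the operation.

```typescript
// Minimal shape of a long-running operation (simplified assumption).
interface Operation<T> {
  done: boolean;
  response?: T;
}

// Poll until the operation reports done, waiting between attempts.
// `refresh` is a placeholder for your status re-fetch call.
async function pollUntilDone<T>(
  refresh: () => Promise<Operation<T>>,
  intervalMs = 10_000
): Promise<Operation<T>> {
  let op = await refresh();
  while (!op.done) {
    // Wait before asking again so we don't hammer the API.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    op = await refresh();
  }
  return op;
}
```

The same pattern applies to File Search upload operations, which are also returned as pollable operations.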
generateSpeech(text, options?)
Convert text to speech.
const audioBase64 = await toolkit.generateSpeech('Hello, world!', {
voiceName: 'Kore',
model: 'gemini-2.5-flash-preview-tts'
});

Options:
- `model` (string): Model name
- `voiceName` (string): `'Kore'` or `'Zephyr'`
createEphemeralToken(config?)
Create an ephemeral token for secure Live API access from client-side applications.
⚠️ Server-side only - Call this from your backend, not client-side.
// Server-side: Create ephemeral token
const token = await toolkit.createEphemeralToken({
uses: 1, // Token can only be used once
expireTime: new Date(Date.now() + 30 * 60 * 1000), // 30 minutes
newSessionExpireTime: new Date(Date.now() + 60 * 1000), // 1 minute
liveConnectConstraints: {
model: 'gemini-2.5-flash-native-audio-preview-09-2025',
config: {
temperature: 0.7,
responseModalities: ['AUDIO']
}
}
});
// Send token.name to client
// Client uses token.name as API key for connectLive()

Options:
- `uses` (number): Number of times the token can be used (default: 1)
- `expireTime` (Date | string): Token expiration (default: 30 minutes)
- `newSessionExpireTime` (Date | string): New session expiration (default: 1 minute)
- `liveConnectConstraints` (object): Lock the token to a specific config
Returns: EphemeralToken with name property (use as API key)
connectLive(callbacks, options?, ephemeralToken?)
Connect to live conversation session. Can use standard API key or ephemeral token.
// Basic usage with standard API key
const session = await toolkit.connectLive({
onopen: () => console.log('Connected'),
onmessage: async (message) => {
console.log('Received:', message);
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Disconnected')
});
// Using ephemeral token (client-side, enhanced security)
const session = await toolkit.connectLive(
{
onopen: () => console.log('Connected'),
onmessage: async (message) => {
console.log('Received:', message);
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Disconnected')
},
{}, // options
ephemeralToken.name // Token from server
);
// With function calling
const session = await toolkit.connectLive({
onopen: () => console.log('Connected'),
onmessage: async (message) => {
// Handle tool calls
if (message.toolCall) {
const functionResponses = [];
for (const fc of message.toolCall.functionCalls) {
functionResponses.push({
id: fc.id,
name: fc.name,
response: { result: 'ok' }
});
}
await session.sendToolResponse({ functionResponses });
}
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Disconnected')
}, {
tools: [{
functionDeclarations: [
{ name: 'turn_on_lights' },
{ name: 'turn_off_lights', behavior: 'NON_BLOCKING' }
]
}]
});
// With Google Search
const session = await toolkit.connectLive({
onopen: () => console.log('Connected'),
onmessage: async (message) => {
// Handle search results
if (message.serverContent?.modelTurn?.parts) {
for (const part of message.serverContent.modelTurn.parts) {
if (part.executableCode) {
console.log('Code:', part.executableCode.code);
}
}
}
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Disconnected')
}, {
tools: [{ googleSearch: {} }]
});
// With session management
const session = await toolkit.connectLive({
onopen: () => console.log('Connected'),
onmessage: async (message) => {
// Handle session resumption updates
if (message.sessionResumptionUpdate?.newHandle) {
// Save handle for resuming session
const newHandle = message.sessionResumptionUpdate.newHandle;
}
// Handle GoAway message
if (message.goAway) {
console.log('Connection closing soon:', message.goAway.timeLeft);
}
// Handle generation complete
if (message.serverContent?.generationComplete) {
console.log('Generation complete');
}
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Disconnected')
}, {
contextWindowCompression: { slidingWindow: {} },
sessionResumption: { handle: previousSessionHandle }
});
// Send audio
await session.sendAudio(audioData);
// Send text
session.sendClientContent({ turns: 'Hello!', turnComplete: true });
// Close session
await session.close();

Callbacks:
- `onopen()`: Called when connection opens
- `onmessage(message)`: Called when a message is received
  - Check `message.toolCall` for function calls
  - Check `message.serverContent` for model responses
  - Check `message.sessionResumptionUpdate` for resumption tokens
  - Check `message.goAway` for connection termination warnings
- `onerror(error)`: Called on error
- `onclose(event)`: Called when connection closes
Options:
- `model` (string): Model name (default: Live model)
- `voiceName` (string): Voice name (default: `'Zephyr'`)
- `responseModalities` (Modality[]): Response modalities (default: `['AUDIO']`)
- `tools` (LiveTool[]): Tools to enable (function calling, Google Search)
- `inputAudioTranscription` (boolean): Enable input audio transcription
- `outputAudioTranscription` (boolean): Enable output audio transcription
- `contextWindowCompression` (ContextWindowCompressionConfig): Enable compression for longer sessions
- `sessionResumption` (SessionResumptionConfig): Configure session resumption
- `realtimeInputConfig` (RealtimeInputConfig): Configure VAD settings
- `thinkingConfig` (ThinkingConfig): Configure thinking budget
- `enableAffectiveDialog` (boolean): Enable affective dialog (requires v1alpha)
- `proactivity` (ProactivityConfig): Configure proactive audio
- `mediaResolution` (MediaResolution): Set media resolution
- `temperature` (number): Temperature setting
Ephemeral Token:
- Pass `ephemeralToken.name` as the third parameter for client-side security
- Ephemeral tokens are short-lived and reduce security risks
Tool Use:
- Function calling: Define functions in `tools[].functionDeclarations`
- Google Search: Enable with `tools: [{ googleSearch: {} }]`
- Handle tool calls in the `onmessage` callback
- Respond with `session.sendToolResponse({ functionResponses })`
Session Management:
- Context window compression: Extend sessions beyond 15 minutes
- Session resumption: Resume sessions across connection resets
- GoAway messages: Receive warnings before connection termination
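The session-management messages above can be handled with a small piece of state in your `onmessage` callback: keep the newest resumption handle and note when a GoAway arrives. The sketch below is illustrative only; the `LiveMessage` shape is a simplified assumption, not the toolkit's full message type.

```typescript
// Simplified message shape (assumption) covering only the fields we track.
type LiveMessage = {
  sessionResumptionUpdate?: { newHandle?: string };
  goAway?: { timeLeft?: string };
};

// Tracks the latest resumption handle so a dropped or closing
// connection can be resumed via sessionResumption: { handle }.
class SessionTracker {
  private handle?: string;
  private closing = false;

  // Call from the onmessage callback for every message received.
  observe(message: LiveMessage): void {
    const newHandle = message.sessionResumptionUpdate?.newHandle;
    if (newHandle) this.handle = newHandle; // keep only the newest handle
    if (message.goAway) this.closing = true; // server will terminate soon
  }

  get resumeHandle(): string | undefined {
    return this.handle;
  }

  get shouldReconnect(): boolean {
    return this.closing && this.handle !== undefined;
  }
}
```

When `shouldReconnect` becomes true, open a new session passing `resumeHandle` in the `sessionResumption` option.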
connectMusic(callbacks, apiKey?)
Connect to Lyria RealTime music generation session for real-time streaming music.
⚠️ Experimental: Lyria RealTime is an experimental model.
⚠️ Requires v1alpha API: This feature requires the v1alpha API version.
const session = await toolkit.connectMusic({
onmessage: async (message) => {
// Process audio chunks (16-bit PCM, 48kHz, stereo)
if (message.serverContent?.audioChunks) {
for (const chunk of message.serverContent.audioChunks) {
const audioBuffer = Buffer.from(chunk.data, 'base64');
// Play audio...
}
}
},
onerror: (error) => console.error('Error:', error),
onclose: () => console.log('Session closed')
});
// Set initial prompts
await session.setWeightedPrompts({
weightedPrompts: [
{ text: 'minimal techno', weight: 1.0 },
{ text: 'deep bass', weight: 0.5 }
]
});
// Set generation config
await session.setMusicGenerationConfig({
musicGenerationConfig: {
bpm: 90,
temperature: 1.0,
density: 0.7,
brightness: 0.6,
scale: 'C_MAJOR_A_MINOR',
audioFormat: 'pcm16',
sampleRateHz: 48000
}
});
// Start generating music
await session.play();
// Control playback
await session.pause();
await session.play();
await session.stop();
await session.resetContext();
// Update prompts in real-time
await session.setWeightedPrompts({
weightedPrompts: [
{ text: 'Piano', weight: 2.0 },
{ text: 'Meditation', weight: 0.5 }
]
});
// Update config (reset context for BPM/scale changes)
await session.setMusicGenerationConfig({
musicGenerationConfig: {
bpm: 120,
scale: 'D_MAJOR_B_MINOR'
}
});
await session.resetContext();

Callbacks:
- `onmessage(message)`: Called when audio chunks or other messages are received
- `onerror(error)`: Called when an error occurs
- `onclose()`: Called when the session closes
Session Methods:
- `setWeightedPrompts({ weightedPrompts })`: Set or update music prompts
- `setMusicGenerationConfig({ musicGenerationConfig })`: Set or update generation config
- `play()`: Start/resume music generation
- `pause()`: Pause music generation
- `stop()`: Stop music generation
- `resetContext()`: Reset context (required after BPM/scale changes)
Music Generation Config:
- `guidance` (0.0-6.0, default: 4.0): How strictly the model follows prompts
- `bpm` (60-200): Beats per minute
- `density` (0.0-1.0): Density of musical notes/sounds
- `brightness` (0.0-1.0): Tonal quality
- `scale` (MusicScale): Musical scale/key
- `muteBass` (boolean): Mute bass output
- `muteDrums` (boolean): Mute drums output
- `onlyBassAndDrums` (boolean): Only output bass and drums
- `musicGenerationMode` (`'QUALITY'` | `'DIVERSITY'` | `'VOCALIZATION'`): Generation mode
- `temperature` (0.0-3.0, default: 1.1): Temperature setting
- `topK` (1-1000, default: 40): Top-K sampling
- `seed` (0-2147483647): Random seed
- `audioFormat` (string, default: `'pcm16'`): Audio format
- `sampleRateHz` (number, default: 48000): Sample rate
Audio Format:
- Output: Raw 16-bit PCM Audio
- Sample rate: 48kHz
- Channels: 2 (stereo)
Note:
- Prompts are checked by safety filters
- Output audio is watermarked
- Model generates instrumental music only
- Implement robust audio buffering for smooth playback
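For sizing that playback buffer, the stream's byte rate follows directly from the format above: 48,000 samples/sec x 2 channels x 2 bytes/sample = 192,000 bytes per second. The helpers below are an illustrative sketch, not part of the toolkit API:

```typescript
// 16-bit PCM, 48kHz, stereo => 192,000 bytes of audio per second.
const BYTES_PER_SECOND = 48_000 * 2 * 2;

// Seconds of audio represented by a run of buffered PCM bytes.
function bufferedSeconds(totalBytes: number): number {
  return totalBytes / BYTES_PER_SECOND;
}

// Buffer size needed to hold a target amount of pre-buffered audio.
function bytesForSeconds(seconds: number): number {
  return Math.ceil(seconds * BYTES_PER_SECOND);
}
```

For example, holding half a second of audio before starting playback requires roughly 96 KB of buffered chunks.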
groundWithSearch(prompt, model?)
Get answers grounded in Google Search.
const result = await toolkit.groundWithSearch(
'What are the latest AI developments?',
'gemini-2.5-pro'
);
console.log(result.text);
console.log(result.candidates); // Citations

Returns: GroundedResult with text and candidates
groundWithMaps(prompt, location?, model?)
Get location-based information.
const result = await toolkit.groundWithMaps(
'Find nearby coffee shops',
{ latitude: 37.7749, longitude: -122.4194 },
'gemini-2.5-pro'
);

Returns: GroundedResult
URL Context Methods
generateWithUrlContext(prompt, model?)
Generate text with URL Context tool enabled, allowing the model to access content from URLs.
// Basic usage - URLs in prompt
const result = await toolkit.generateWithUrlContext(
'Compare the ingredients from https://example.com/recipe1 and https://example.com/recipe2'
);
console.log(result.text);
// Access URL retrieval metadata
const urlMetadata = result.candidates?.[0]?.urlContextMetadata;
if (urlMetadata?.urlMetadata) {
urlMetadata.urlMetadata.forEach((meta) => {
console.log(`URL: ${meta.retrievedUrl}`);
console.log(`Status: ${meta.urlRetrievalStatus}`);
});
}

Parameters:
- `prompt` (string): The prompt containing the URLs to analyze (URLs should appear in the prompt text)
- `model` (string, optional): Model name (default: `'gemini-2.5-flash'`)
Returns: Promise<GroundedResult> - Results with URL metadata
Limitations:
- Up to 20 URLs per request
- Maximum 34MB per URL
- URLs must be publicly accessible (no login/paywall)
- Supported content types: HTML, JSON, PDF, images (PNG, JPEG, BMP, WebP)
Use Cases:
- Extract data from multiple URLs
- Compare documents, articles, or reports
- Synthesize content from several sources
- Analyze code and documentation from GitHub
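Because the URLs travel inside the prompt text, it can be worth a pre-flight check against the 20-URL limit before sending. This is an illustrative sketch; the regex is a naive assumption for counting, not how the API actually parses URLs:

```typescript
// Naive URL extraction -- adequate for a pre-flight count, not validation.
function countUrls(prompt: string): number {
  return (prompt.match(/https?:\/\/[^\s)]+/g) ?? []).length;
}

// Throws before you spend a request on a prompt the API will reject.
function checkUrlLimit(prompt: string, maxUrls = 20): void {
  const n = countUrls(prompt);
  if (n > maxUrls) {
    throw new Error(`Prompt contains ${n} URLs; the limit is ${maxUrls}.`);
  }
}
```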
generateWithUrlContextAndSearch(prompt, model?)
Generate text with both URL Context and Google Search tools enabled.
const result = await toolkit.generateWithUrlContextAndSearch(
'Find the latest AI developments and analyze https://example.com/ai-report'
);

Use Cases:
- Search the web and then analyze specific URLs in depth
- Combine broad search with detailed URL analysis
- Get comprehensive answers using both tools
Parameters:
- `prompt` (string): The prompt containing URLs and/or search queries
- `model` (string, optional): Model name (default: `'gemini-2.5-flash'`)
Returns: Promise<GroundedResult> - Combined results
generateWithThinking(prompt, thinkingBudget?, model?)
Generate text with extended thinking capabilities.
const result = await toolkit.generateWithThinking(
'Solve this complex problem step by step...',
32768, // Thinking budget
'gemini-2.5-pro'
);

Parameters:
- `prompt` (string): The problem to solve
- `thinkingBudget` (number): Tokens for thinking (default: 32768)
- `model` (string): Model name (default: `'gemini-2.5-pro'`)
Files API Methods
uploadFile(filePath, config?)
Upload a file using the Files API. Use when request size exceeds 20MB or for reusable file references.
// Node.js
const file = await toolkit.uploadFile('document.pdf', {
displayName: 'My Document',
mimeType: 'application/pdf'
});
// Browser
const fileInput = document.querySelector('input[type="file"]');
const file = await toolkit.uploadFile(fileInput.files[0], {
displayName: 'My Document'
});
// Use in generateText
const result = await toolkit.generateText('Describe this document', {
files: [file]
});

Parameters:
- `filePath` (string | File | Blob): Path to file (Node.js) or File/Blob (browser)
- `config` (UploadFileConfig | string, optional): Configuration or display name
  - `displayName` (string, optional): Display name for the file
  - `mimeType` (string, optional): MIME type (auto-detected if not provided)

Returns: Promise<FileObject> - Uploaded file with metadata

Note: Files are automatically deleted after 48 hours. Use the Files API for files larger than 20MB or when you need to reuse files across multiple requests.
getFile(fileName)
Get metadata for an uploaded file.
const file = await toolkit.uploadFile('document.pdf');
const metadata = await toolkit.getFile(file.name);
console.log(metadata.state); // 'ACTIVE' or 'PROCESSING'

Returns: Promise<FileObject>
listFiles(pageSize?)
List all uploaded files.
const files = await toolkit.listFiles(10);
for await (const file of files) {
console.log(file.name, file.displayName);
}

Returns: Promise<Iterable<FileObject>>
deleteFile(fileName)
Delete an uploaded file.
await toolkit.deleteFile('files/my-file-123');

Returns: Promise<void>
Context Caching Methods
createCache(model, config)
Create a cache for context caching to reduce costs on repeated requests.
let videoFile = await toolkit.uploadFile('movie.mp4');
// Wait for processing
while (videoFile.state !== 'ACTIVE') {
await new Promise(resolve => setTimeout(resolve, 2000));
videoFile = await toolkit.getFile(videoFile.name);
}
const cache = await toolkit.createCache('gemini-2.0-flash-001', {
displayName: 'movie-analysis-cache',
systemInstruction: 'You are an expert video analyzer.',
contents: [videoFile],
ttl: '300s' // 5 minutes
});
// Use cache
const result = await toolkit.generateText('Describe the characters', {
cachedContent: cache.name
});

Parameters:
- `model` (string): Model name (must use an explicit version like `gemini-2.0-flash-001`)
- `config` (CreateCacheConfig): Cache configuration
  - `displayName` (string, optional): Display name
  - `systemInstruction` (string, optional): System instruction to cache
  - `contents` (unknown[], optional): Contents to cache
  - `ttl` (string | number, optional): Time to live (e.g., `'300s'` or `300`)
  - `expireTime` (Date | string, optional): Expiration time
Returns: Promise<CachedContent>
Note: Minimum 2,048 tokens (2.5 Flash) or 4,096 tokens (2.5 Pro). Cached tokens billed at reduced rate.
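Given those per-model minimums, a `countTokens` result can tell you whether content is even eligible before you create a cache. A hedged sketch (the threshold table just mirrors the note above; the helper is not toolkit API):

```typescript
// Minimum cacheable token counts per the note above (assumed mapping).
const CACHE_MINIMUMS: Record<string, number> = {
  'gemini-2.5-flash': 2_048,
  'gemini-2.5-pro': 4_096,
};

// True when content is large enough to cache for this model.
// Unknown models fall back to the larger minimum to stay conservative.
function isCacheable(model: string, totalTokens: number): boolean {
  const minimum = CACHE_MINIMUMS[model] ?? 4_096;
  return totalTokens >= minimum;
}
```

Pair this with `countTokens` on the content you intend to cache; skipping ineligible content avoids a failed `createCache` call.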
listCaches()
List all cached content objects.
const caches = await toolkit.listCaches();
for await (const cache of caches) {
console.log(cache.name, cache.displayName);
}

Returns: Promise<Iterable<CachedContent>>
getCache(cacheName)
Get metadata for a cached content object.
const cache = await toolkit.getCache('cachedContents/my-cache-123');
console.log(cache.expireTime);

Returns: Promise<CachedContent>
updateCache(cacheName, config)
Update a cache's TTL or expiration time.
await toolkit.updateCache(cache.name, { ttl: '600s' });

Returns: Promise<CachedContent>
deleteCache(cacheName)
Delete a cached content object.
await toolkit.deleteCache('cachedContents/my-cache-123');

Returns: Promise<void>
Token Counting Methods
countTokens(contents, model?)
Count tokens for any content before sending to the API.
// Count text tokens
const count = await toolkit.countTokens('Hello, world!');
console.log(count.totalTokens);
// Count tokens for file + text
const file = await toolkit.uploadFile('image.jpg');
const fileCount = await toolkit.countTokens(['Describe this image', file]);
// Count chat history
const chat = toolkit.createChat();
await chat.sendMessage({ message: 'Hello' });
const historyCount = await toolkit.countTokens(chat.getHistory());

Parameters:
- `contents` (unknown): Content to count (text, files, chat history, etc.)
- `model` (string, optional): Model name (default: `'gemini-2.5-flash'`)
Returns: Promise<TokenCount> - Token count result
Note: 1 token ≈ 4 characters, 100 tokens ≈ 60-80 words. Images: 258 tokens (2.0) or variable. Video: 263 tokens/sec. Audio: 32 tokens/sec.
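For a quick offline estimate before calling `countTokens`, the approximately-4-characters-per-token rule above can be turned into a helper. Treat it as a rough heuristic only, not toolkit API; actual tokenization is model-dependent:

```typescript
// Rough estimate only -- use countTokens for exact, billable numbers.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4); // ~4 characters per token
}
```

This is useful for cheap client-side checks, such as warning before a prompt likely exceeds a budget.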
File Search (RAG) Methods
createFileSearchStore(displayName?)
Create a new File Search store for RAG.
const store = await toolkit.createFileSearchStore('my-documents');
console.log(store.name); // Use this for uploads and queries

Parameters:
- `displayName` (string, optional): Display name for the store
Returns: Promise<FileSearchStore> - Created File Search store
listFileSearchStores()
List all File Search stores.
const stores = toolkit.listFileSearchStores();
for await (const store of stores) {
console.log(store.name, store.displayName);
}

Returns: AsyncIterable<FileSearchStore> - Iterable of File Search stores
getFileSearchStore(name)
Get a specific File Search store by name.
const store = await toolkit.getFileSearchStore('fileSearchStores/my-store-123');

Parameters:
- `name` (string): Store name (e.g., `'fileSearchStores/my-store-123'`)
Returns: Promise<FileSearchStore> - File Search store
deleteFileSearchStore(name, force?)
Delete a File Search store.
await toolkit.deleteFileSearchStore('fileSearchStores/my-store-123', true);

Parameters:
- `name` (string): Store name to delete
- `force` (boolean): Force delete (default: `true`)
uploadToFileSearchStore(filePath, fileSearchStoreName, config?)
Upload a file directly to a File Search store (combines upload and import).
let operation = await toolkit.uploadToFileSearchStore(
'document.pdf',
store.name,
{
displayName: 'My Document',
customMetadata: [
{ key: 'author', stringValue: 'Robert Graves' },
{ key: 'year', numericValue: 1934 }
],
chunkingConfig: {
whiteSpaceConfig: {
maxTokensPerChunk: 200,
maxOverlapTokens: 20
}
}
}
);
// Poll for completion
while (!operation.done) {
await new Promise(resolve => setTimeout(resolve, 5000));
operation = await toolkit.getClient().operations.get({ operation });
}

Parameters:
- `filePath` (string): Path to the file to upload
- `fileSearchStoreName` (string): Name of the File Search store
- `config` (FileSearchUploadConfig, optional): Upload configuration
  - `displayName` (string, optional): Display name for the file
  - `customMetadata` (FileMetadata[], optional): Custom metadata
  - `chunkingConfig` (ChunkingConfig, optional): Chunking configuration
Returns: Promise<Operation> - Operation that can be polled for completion
importFileToFileSearchStore(fileSearchStoreName, fileName, config?)
Import an existing file into a File Search store.
const operation = await toolkit.importFileToFileSearchStore(
store.name,
uploadedFile.name,
{
customMetadata: [
{ key: 'author', stringValue: 'Robert Graves' }
],
chunkingConfig: {
whiteSpaceConfig: {
maxTokensPerChunk: 200,
maxOverlapTokens: 20
}
}
}
);

Parameters:
- `fileSearchStoreName` (string): Name of the File Search store
- `fileName` (string): Name of the file (from the Files API)
- `config` (FileSearchImportConfig, optional): Import configuration
Returns: Promise<Operation> - Operation that can be polled for completion
queryWithFileSearch(prompt, config, model?)
Query documents with File Search (RAG) to get answers grounded in uploaded documents.
// Basic query
const result = await toolkit.queryWithFileSearch(
'Tell me about Robert Graves',
{ fileSearchStoreNames: [store.name] }
);
console.log(result.text);
// Query with metadata filter
const filteredResult = await toolkit.queryWithFileSearch(
"Tell me about the book 'I, Claudius'",
{
fileSearchStoreNames: [store.name],
metadataFilter: 'author="Robert Graves"'
}
);
// Access citations
const citations = result.candidates?.[0]?.groundingMetadata;
console.log(citations);

Parameters:
- `prompt` (string): The query or prompt
- `config` (FileSearchQueryConfig): File Search configuration
  - `fileSearchStoreNames` (string[]): Array of File Search store names
  - `metadataFilter` (string, optional): Metadata filter (e.g., `'author="Robert Graves"'`)
- `model` (string, optional): Model name (default: `'gemini-2.5-flash'`)
Returns: Promise<GroundedResult> - Query results with citations
🎨 Presets
79 ready-to-use preset configurations covering all use cases. No configuration needed - just use the preset!
Text Presets (22 presets)
import { generateText, presets } from 'gemini-ai-toolkit';
// Speed presets
const fast = await generateText('Quick answer', presets.text.fast);
const smart = await generateText('Complex question', presets.text.smart);
// Style presets
const creative = await generateText('Story', presets.text.creative);
const concise = await generateText('Summary', presets.text.concise);
const detailed = await generateText('Analysis', presets.text.detailed);
const balanced = await generateText('Answer', presets.text.balanced);
// Use case presets
const code = await generateText('Write function', presets.text.code);
const qa = await generateText('What is AI?', presets.text.qa);
const technical = await generateText('Explain API', pr