
gemini-ai-toolkit

v1.4.0 • Published

A comprehensive toolkit for the Google Gemini API, providing easy-to-use interfaces for text, chat, image, video, audio, and grounding features with all the latest Gemini models.

Downloads: 13

Readme

🤖 Gemini AI Toolkit

The most comprehensive, developer-friendly toolkit for Google's Gemini API


Features • Quick Start • Documentation • Examples • API Reference • Presets • Contributing



🎯 Overview

Gemini AI Toolkit is a production-ready, TypeScript-first npm package that provides a clean, intuitive interface to Google's powerful Gemini API. Built with developer experience in mind, it offers:

  • 🚀 One-line functions for minimal code usage
  • 🎨 79 preset configurations for common use cases
  • 🛠️ Developer-friendly utilities for file operations
  • 🤖 Smart helpers with auto-detection and auto-retry
  • 📚 File Search (RAG) for querying your documents
  • 📦 Zero dependencies (only @google/genai as peer dependency)
  • 🔒 Full TypeScript support with strict type checking
  • 🔑 Auto API key detection from environment variables
  • 🎯 Comprehensive error handling with helpful messages
  • 📚 24 detailed examples covering all features

Why Gemini AI Toolkit?

| Feature | This Package | Others |
|---------|--------------|--------|
| Code Required | 1 line | 3-5 lines |
| Presets | 79 ready-to-use | Manual config |
| Type Safety | 100% TypeScript | Partial |
| Utilities | Built-in | External libs |
| Error Messages | Actionable tips | Generic |
| Documentation | Comprehensive | Basic |
| Examples | 24 examples | Few/None |


✨ Features

🎯 Core Capabilities

| Feature | Description | Models Supported |
|---------|-------------|------------------|
| 📝 Text Generation | Generate text with latest Gemini models | gemini-2.5-flash, gemini-2.5-pro |
| 💬 Chat Conversations | Create and manage chat sessions with context | All text models |
| 🖼️ Image Generation | Generate images with Imagen 4.0 | imagen-4.0-generate-001 |
| 🎨 Image Editing | Edit images with text prompts | Imagen models |
| 🔍 Image Understanding | Analyze and understand image content | gemini-2.5-flash-image |
| 🎬 Video Generation | Generate videos from images with Veo 3.1 | veo-3.1-fast-generate-preview |
| 📹 Video Understanding | Analyze video content frame-by-frame | All vision models |
| 🔊 Text-to-Speech | Convert text to natural speech | gemini-2.5-flash-preview-tts |
| 🎤 Live Conversations | Real-time audio conversations | gemini-2.5-flash-native-audio-preview-09-2025 |
| 🌐 Grounded Search | Get up-to-date answers from Google Search | All text models |
| 🗺️ Grounded Maps | Find location-based information | All text models |
| 📚 File Search (RAG) | Query your documents with Retrieval Augmented Generation | gemini-2.5-flash, gemini-2.5-pro |
| 🔗 URL Context | Analyze content from web pages, PDFs, and URLs | gemini-2.5-flash, gemini-2.5-pro |
| 🧠 Thinking Mode | Tackle complex problems with extended thinking | gemini-2.5-pro |
| 📁 Files API | Upload, manage, and use media files (images, videos, audio, documents) | All multimodal models |
| 💾 Context Caching | Cache content to reduce costs on repeated requests | gemini-2.0-flash-001, gemini-2.5-flash, gemini-2.5-pro |
| 🔢 Token Counting | Count tokens for any content before sending to API | All models |
| 🎵 Lyria RealTime | Real-time streaming music generation with interactive control | models/lyria-realtime-exp (experimental) |

🎁 Developer Experience Features

  • Quick Functions: One-liner functions for common operations
  • 🎨 79 Presets: Pre-configured options for all use cases
  • 🛠️ Utilities: File operations, batch processing, streaming helpers
  • 🔄 Auto-initialization: Automatic API key detection
  • 📦 Minimal Dependencies: Only 1 production dependency
  • 🎯 Type Safety: Full TypeScript support with strict mode
  • 📚 Comprehensive Docs: Detailed documentation and examples

📦 Installation

Prerequisites

  • Node.js >= 18.0.0
  • npm or yarn or pnpm
  • Google Gemini API Key (Get one here)

Install

# Using npm
npm install gemini-ai-toolkit

# Using yarn
yarn add gemini-ai-toolkit

# Using pnpm
pnpm add gemini-ai-toolkit

Get Your API Key

  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Create a new API key
  4. Copy and store it securely
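The toolkit's quick functions pick this key up automatically. As an illustrative sketch only (not the toolkit's actual source), the resolution order looks roughly like this: an explicitly passed key wins, otherwise the `GEMINI_API_KEY` environment variable is used:

```javascript
// Illustrative sketch: resolve an API key from an explicit argument first,
// then fall back to the GEMINI_API_KEY environment variable.
function resolveApiKey(explicitKey) {
  const key = explicitKey ?? process.env.GEMINI_API_KEY;
  if (!key) {
    throw new Error('No API key found. Pass one explicitly or set GEMINI_API_KEY.');
  }
  return key;
}

process.env.GEMINI_API_KEY = 'env-key';
console.log(resolveApiKey('explicit-key')); // explicit wins
console.log(resolveApiKey());               // falls back to env
```

Keeping the key out of source code and in the environment also makes it harder to leak via version control.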

🚀 Quick Start

Option 1: Service-Based Architecture (Recommended) 🏗️

Perfect for applications requiring maintainable, modular code

import { GeminiToolkit, CoreAIService, ChatService, GroundingService } from 'gemini-ai-toolkit';

// Initialize toolkit once
const toolkit = new GeminiToolkit({
  apiKey: 'your-api-key-here' // or set GEMINI_API_KEY env var
});

// Access service instances directly
const { coreAI, chat, grounding, fileSearch, files, cache, tokens } = toolkit;

// Core AI operations
const text = await coreAI.generateText('Explain quantum computing in simple terms');
const image = await coreAI.generateImage('A futuristic robot in a cyberpunk city');

// Chat conversations
const chatSession = chat.createChat('gemini-2.5-pro');
const response = await chatSession.sendMessage({ message: 'Hello!' });

// Grounded search
const searchResults = await grounding.groundWithSearch('Latest AI developments in 2024');

// File Search (RAG) - query your documents
const store = await fileSearch.createFileSearchStore('my-documents');
const operation = await fileSearch.uploadToFileSearchStore('document.pdf', store.name);
// Wait for operation.done, then query:
const answer = await fileSearch.queryWithFileSearch('Tell me about X', {
  fileSearchStoreNames: [store.name]
});

// Files API - upload and use files
const file = await files.uploadFile('image.jpg', { displayName: 'My Image' });
const analysis = await coreAI.generateText('Describe this image', { files: [file] });

// Context Caching - reduce costs on repeated requests
const cacheObj = await cache.createCache('gemini-2.0-flash-001', {
  systemInstruction: 'You are a helpful assistant.',
  contents: [file],
  ttl: '300s'
});
const cachedResult = await coreAI.generateText('What is this?', { cachedContent: cacheObj.name });

// Token Counting - estimate costs
const tokenCount = await tokens.countTokens('Hello, world!');
console.log(`Tokens: ${tokenCount.totalTokens}`);

// Live conversations with ephemeral tokens
const token = await chat.createEphemeralToken({
  uses: 1,
  expireTime: new Date(Date.now() + 30 * 60 * 1000) // 30 minutes
});
const liveSession = await chat.connectLive({
  onmessage: async (message) => console.log('Received:', message),
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Session closed')
}, {}, token.name);

// Lyria RealTime music generation (experimental, requires v1alpha)
const musicSession = await chat.connectMusic({
  onmessage: async (message) => {
    if (message.serverContent?.audioChunks) {
      // Process audio chunks (16-bit PCM, 48kHz, stereo)
    }
  },
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Session closed')
});
await musicSession.setWeightedPrompts({
  weightedPrompts: [{ text: 'minimal techno', weight: 1.0 }]
});
await musicSession.setMusicGenerationConfig({
  musicGenerationConfig: { bpm: 90, temperature: 1.0 }
});
await musicSession.play();

Option 2: One-Line Functions ⚡

Perfect for quick scripts and minimal code usage

import { generateText, generateImage, search, createFileSearchStore, uploadToFileSearchStore, queryFileSearch, uploadFile, createCache, countTokens, connectMusic } from 'gemini-ai-toolkit';

// Set GEMINI_API_KEY environment variable
// export GEMINI_API_KEY="your-api-key-here"

// One line - that's it!
const text = await generateText('Explain quantum computing in simple terms');
const image = await generateImage('A futuristic robot in a cyberpunk city');
const results = await search('Latest AI developments in 2024');

// File Search (RAG) - query your documents
const store = await createFileSearchStore('my-documents');
const operation = await uploadToFileSearchStore('document.pdf', store.name);
// Wait for operation.done, then query:
const answer = await queryFileSearch('Tell me about X', {
  fileSearchStoreNames: [store.name]
});

// Files API - upload and use files
const file = await uploadFile('image.jpg', { displayName: 'My Image' });
const result = await generateText('Describe this image', { files: [file] });

// Context Caching - reduce costs on repeated requests
const cache = await createCache('gemini-2.0-flash-001', {
  systemInstruction: 'You are a helpful assistant.',
  contents: [file],
  ttl: '300s' // 5 minutes
});
const cachedResult = await generateText('What is this?', { cachedContent: cache.name });

// Token Counting - estimate costs
const tokenCount = await countTokens('Hello, world!');
console.log(`Tokens: ${tokenCount.totalTokens}`);

// Lyria RealTime - generate music (experimental, requires v1alpha)
const musicSession = await connectMusic({
  onmessage: async (message) => {
    if (message.serverContent?.audioChunks) {
      // Process audio chunks (16-bit PCM, 48kHz, stereo)
    }
  },
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Session closed')
});
await musicSession.setWeightedPrompts({
  weightedPrompts: [{ text: 'minimal techno', weight: 1.0 }]
});
await musicSession.setMusicGenerationConfig({
  musicGenerationConfig: { bpm: 90, temperature: 1.0 }
});
await musicSession.play();

Option 3: Initialize Once, Use Everywhere

Best for applications with multiple API calls

import { init, generateText, generateImage } from 'gemini-ai-toolkit';

// Initialize once at app startup
init('your-api-key-here');

// Now use anywhere - no API key needed!
const text1 = await generateText('First prompt');
const text2 = await generateText('Second prompt');
const image = await generateImage('A robot');

Option 4: Full Class API (Maximum Control)

Best for complex applications needing fine-grained control

import { GeminiToolkit } from 'gemini-ai-toolkit';

const toolkit = new GeminiToolkit({ 
  apiKey: 'your-api-key-here' 
});

const text = await toolkit.generateText('Hello, world!');
const chat = toolkit.createChat('gemini-2.5-pro');

Environment Variable Setup

# Linux/macOS
export GEMINI_API_KEY="your-api-key-here"

# Windows (PowerShell)
$env:GEMINI_API_KEY="your-api-key-here"

# Windows (CMD)
set GEMINI_API_KEY=your-api-key-here

# Or use a .env file (recommended)
echo "GEMINI_API_KEY=your-api-key-here" > .env
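If you go the `.env` route, a loader such as dotenv handles parsing for you. For illustration only, here is roughly what parsing a `.env` line involves (this simplified helper is a sketch, not a replacement for dotenv):

```javascript
// Minimal sketch of .env parsing: KEY=VALUE lines, ignoring comments/blanks.
// In real projects, prefer a maintained loader such as dotenv.
function parseEnv(text) {
  const result = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue;
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue;
    const key = trimmed.slice(0, eq).trim();
    // Strip optional surrounding quotes from the value
    const value = trimmed.slice(eq + 1).trim().replace(/^["']|["']$/g, '');
    result[key] = value;
  }
  return result;
}

const env = parseEnv('# comment\nGEMINI_API_KEY="your-api-key-here"\n');
console.log(env.GEMINI_API_KEY); // your-api-key-here
```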

💡 Examples

The toolkit includes comprehensive examples demonstrating different usage patterns. Run them with:

# List all available examples
npm run examples

# Run the basic service-based example (recommended)
npm run example:basic

# Run the advanced patterns example
npm run example:advanced

# Run the migration guide
npm run example:migration

Example Categories

See examples/README.md for detailed documentation of all examples.


📚 API Reference

Quick Functions (One-Liners)

All quick functions automatically detect your API key from GEMINI_API_KEY environment variable or use the cached instance from init().

generateText(prompt, options?, apiKey?)

Generate text content with a single line of code.

import { generateText, presets } from 'gemini-ai-toolkit';

// Basic usage
const text = await generateText('What is artificial intelligence?');

// With options
const text = await generateText('Explain quantum computing', {
  model: 'gemini-2.5-pro',
  config: { temperature: 0.7, maxOutputTokens: 2000 }
});

// With preset
const text = await generateText('Quick answer', presets.text.fast);

// With explicit API key
const text = await generateText('Hello!', undefined, 'your-api-key');

Parameters:

  • prompt (string, required): The text prompt
  • options (GenerateTextOptions, optional): Configuration options
  • apiKey (string, optional): API key (overrides env var)

Returns: Promise<string> - Generated text

generateImage(prompt, options?, apiKey?)

Generate images with Imagen 4.0.

import { generateImage, presets, saveImage } from 'gemini-ai-toolkit';

// Basic usage
const imageBase64 = await generateImage('A robot with a skateboard');

// With preset
const imageBase64 = await generateImage('A landscape', presets.image.wide);

// Save to file
saveImage(imageBase64, 'output.png');

Parameters:

  • prompt (string, required): Image description
  • options (GenerateImageOptions, optional): Configuration options
  • apiKey (string, optional): API key

Returns: Promise<string> - Base64 encoded image

createChat(model?, apiKey?)

Create a chat session for conversational interactions.

import { createChat, presets } from 'gemini-ai-toolkit';

// Basic usage
const chat = createChat();

// With model
const chat = createChat('gemini-2.5-pro');

// With preset
const chat = createChat(presets.chat.professional);

// Use the chat
const response = await chat.sendMessage({ message: 'Hello!' });
console.log(response.text);

// Streaming
const stream = await chat.sendMessageStream({ message: 'Tell a story' });
for await (const chunk of stream) {
  process.stdout.write(chunk.text);
}

Returns: Chat instance

generateSpeech(text, options?, apiKey?)

Convert text to natural speech.

import { generateSpeech, saveAudio, presets } from 'gemini-ai-toolkit';

// Basic usage
const audioBase64 = await generateSpeech('Hello, world!');

// With preset
const audioBase64 = await generateSpeech('Welcome!', presets.speech.narration);

// Save to file
saveAudio(audioBase64, 'output.wav');

Returns: Promise<string> - Base64 encoded audio

search(query, apiKey?)

Search the web and get grounded answers.

import { search } from 'gemini-ai-toolkit';

const result = await search('Latest AI developments in 2024');
console.log(result.text);

Returns: Promise<GroundedResult> - Search results with citations

findNearby(query, location, apiKey?)

Find nearby places using Google Maps.

import { findNearby } from 'gemini-ai-toolkit';

const places = await findNearby('restaurants', {
  latitude: 37.7749,
  longitude: -122.4194
});
console.log(places.text);

Returns: Promise<GroundedResult> - Location-based results

analyzeImage(imageBase64, prompt, mimeType, options?, apiKey?)

Analyze image content.

import { analyzeImage, loadImage, presets } from 'gemini-ai-toolkit';

const imageBase64 = await loadImage('photo.jpg');
const analysis = await analyzeImage(
  imageBase64,
  'What is in this image?',
  'image/jpeg',
  presets.analysis.detailed
);

Returns: Promise<string> - Analysis text

editImage(imageBase64, mimeType, prompt, apiKey?)

Edit images with text prompts.

import { editImage, loadImage, saveImage } from 'gemini-ai-toolkit';

const imageBase64 = await loadImage('input.png');
const edited = await editImage(
  imageBase64,
  'image/png',
  'Add a sunset in the background'
);
saveImage(edited, 'output.png');

Returns: Promise<string> - Base64 encoded edited image

init(apiKey)

Initialize the toolkit once for use with quick functions.

import { init, generateText } from 'gemini-ai-toolkit';

// Initialize once
init('your-api-key-here');

// Now all quick functions work without passing API key
const text = await generateText('Hello!');

getToolkit()

Get the default toolkit instance.

import { getToolkit } from 'gemini-ai-toolkit';

const toolkit = getToolkit();
// Use toolkit methods directly

queryFileSearch(prompt, config, model?, apiKey?)

Query your documents with File Search (RAG) for accurate, context-aware answers.

import { queryFileSearch, createFileSearchStore, uploadToFileSearchStore } from 'gemini-ai-toolkit';

// Create a File Search store
const store = await createFileSearchStore('my-documents');

// Upload a file (wait for operation to complete)
const operation = await uploadToFileSearchStore('document.pdf', store.name);
// Poll operation.done until true...

// Query your documents
const result = await queryFileSearch('Tell me about Robert Graves', {
  fileSearchStoreNames: [store.name]
});
console.log(result.text);

Parameters:

  • prompt (string, required): The query or prompt
  • config (FileSearchQueryConfig, required): File Search configuration
    • fileSearchStoreNames (string[], required): Array of File Search store names
    • metadataFilter (string, optional): Metadata filter (e.g., 'author="Robert Graves"')
  • model (string, optional): Model name (default: 'gemini-2.5-flash')
  • apiKey (string, optional): API key

Returns: Promise<GroundedResult> - Query results with citations

createFileSearchStore(displayName?, apiKey?)

Create a new File Search store for RAG.

import { createFileSearchStore } from 'gemini-ai-toolkit';

const store = await createFileSearchStore('my-documents');
console.log(store.name); // Use this name for uploads and queries

Returns: Promise<FileSearchStore> - Created File Search store

createEphemeralToken(config?, apiKey?)

Create ephemeral token for secure Live API access (server-side only).

import { createEphemeralToken } from 'gemini-ai-toolkit';

// Server-side: Create token
const token = await createEphemeralToken({
  uses: 1,
  expireTime: new Date(Date.now() + 30 * 60 * 1000), // 30 minutes
  newSessionExpireTime: new Date(Date.now() + 60 * 1000), // 1 minute
  liveConnectConstraints: {
    model: 'gemini-2.5-flash-native-audio-preview-09-2025',
    config: {
      temperature: 0.7,
      responseModalities: ['AUDIO']
    }
  }
});
// Send token.name to client for use with connectLive()

Parameters:

  • config (EphemeralTokenConfig, optional): Token configuration
    • uses (number, optional): Number of uses (default: 1)
    • expireTime (Date | string, optional): Expiration (default: 30 minutes)
    • newSessionExpireTime (Date | string, optional): New session expiration (default: 1 minute)
    • liveConnectConstraints (object, optional): Lock token to specific config
  • apiKey (string, optional): API key

Returns: Promise<EphemeralToken> - Token with name property (use as API key)

Note: ⚠️ Server-side only. Ephemeral tokens enhance security for client-side Live API access.

Files API Quick Functions

uploadFile(filePath, config?, apiKey?)

Quick file upload - minimal code!

import { uploadFile } from 'gemini-ai-toolkit';

const file = await uploadFile('document.pdf', { displayName: 'My Document' });

Returns: Promise<FileObject>

getFile(fileName, apiKey?)

Quick file metadata retrieval - minimal code!

import { getFile } from 'gemini-ai-toolkit';

const metadata = await getFile('files/my-file-123');

Returns: Promise<FileObject>

listFiles(pageSize?, apiKey?)

Quick file listing - minimal code!

import { listFiles } from 'gemini-ai-toolkit';

const files = await listFiles(10);
for await (const file of files) {
  console.log(file.name);
}

Returns: Promise<Iterable<FileObject>>

deleteFile(fileName, apiKey?)

Quick file deletion - minimal code!

import { deleteFile } from 'gemini-ai-toolkit';

await deleteFile('files/my-file-123');

Returns: Promise<void>

Context Caching Quick Functions

createCache(model, config, apiKey?)

Quick cache creation - minimal code!

import { createCache, uploadFile } from 'gemini-ai-toolkit';

const file = await uploadFile('video.mp4');
const cache = await createCache('gemini-2.0-flash-001', {
  displayName: 'my-cache',
  contents: [file],
  ttl: '300s' // 5 minutes
});

Returns: Promise<CachedContent>

listCaches(apiKey?)

Quick cache listing - minimal code!

import { listCaches } from 'gemini-ai-toolkit';

const caches = await listCaches();
for await (const cache of caches) {
  console.log(cache.name);
}

Returns: Promise<Iterable<CachedContent>>

getCache(cacheName, apiKey?)

Quick cache retrieval - minimal code!

import { getCache } from 'gemini-ai-toolkit';

const cache = await getCache('cachedContents/my-cache-123');

Returns: Promise<CachedContent>

updateCache(cacheName, config, apiKey?)

Quick cache update - minimal code!

import { updateCache } from 'gemini-ai-toolkit';

await updateCache('cachedContents/my-cache-123', { ttl: '600s' });

Returns: Promise<CachedContent>

deleteCache(cacheName, apiKey?)

Quick cache deletion - minimal code!

import { deleteCache } from 'gemini-ai-toolkit';

await deleteCache('cachedContents/my-cache-123');

Returns: Promise<void>

Token Counting Quick Functions

countTokens(contents, model?, apiKey?)

Quick token counting - minimal code!

import { countTokens } from 'gemini-ai-toolkit';

const count = await countTokens('Hello, world!');
console.log(count.totalTokens);

Returns: Promise<TokenCount>
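If you only need a rough pre-flight estimate without an API call, a common rule of thumb for English text is about four characters per token. This heuristic is an approximation only and is not what `countTokens` returns; actual tokenization is model-specific:

```javascript
// Rough heuristic only (~4 characters per token for English prose).
// Use countTokens() for real counts; this is just a quick sanity check.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

console.log(estimateTokens('Hello, world!')); // 4 (13 chars / 4, rounded up)
```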

connectMusic(callbacks, apiKey?)

Quick music session connection - minimal code!

import { connectMusic } from 'gemini-ai-toolkit';

const session = await connectMusic({
  onmessage: async (message) => {
    // Handle audio chunks
  },
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Closed')
});

Returns: Promise<MusicSession> - Music session object

Note: ⚠️ Experimental model, requires v1alpha API.
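The audio chunks arrive as 16-bit PCM at 48 kHz, stereo. As an illustrative sketch (the chunk delivery format here is an assumption for the example), decoding one base64 chunk into normalized float samples looks like:

```javascript
// Illustrative sketch: decode a base64 chunk of 16-bit little-endian PCM
// into Float32 samples normalized to [-1, 1). Stereo samples are interleaved.
function decodePcm16(base64Chunk) {
  const buf = Buffer.from(base64Chunk, 'base64');
  const samples = new Float32Array(buf.length / 2);
  for (let i = 0; i < samples.length; i++) {
    samples[i] = buf.readInt16LE(i * 2) / 32768;
  }
  return samples;
}

// Two interleaved samples: maximum positive and minimum negative amplitude.
const chunk = Buffer.from(new Int16Array([32767, -32768]).buffer).toString('base64');
console.log(decodePcm16(chunk)); // Float32Array [ ~0.99997, -1 ]
```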

uploadToFileSearchStore(filePath, fileSearchStoreName, config?, apiKey?)

Upload a file directly to a File Search store (combines upload and import).

import { uploadToFileSearchStore } from 'gemini-ai-toolkit';

let operation = await uploadToFileSearchStore(
  'document.pdf',
  store.name,
  {
    displayName: 'My Document',
    customMetadata: [
      { key: 'author', stringValue: 'Robert Graves' },
      { key: 'year', numericValue: 1934 }
    ],
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 200,
        maxOverlapTokens: 20
      }
    }
  }
);

// Poll operation.done until true
while (!operation.done) {
  await new Promise(resolve => setTimeout(resolve, 5000));
  operation = await getClient().operations.get({ operation });
}

Parameters:

  • filePath (string, required): Path to the file to upload
  • fileSearchStoreName (string, required): Name of the File Search store
  • config (FileSearchUploadConfig, optional): Upload configuration
    • displayName (string, optional): Display name for the file
    • customMetadata (FileMetadata[], optional): Custom metadata
    • chunkingConfig (ChunkingConfig, optional): Chunking configuration
  • apiKey (string, optional): API key

Returns: Promise<Operation> - Operation that can be polled for completion
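The polling loop shown above can be factored into a small helper with a timeout. This is an illustrative sketch: the `getOperation` callback stands in for however you refresh the operation through the SDK, and is an assumption for the example:

```javascript
// Illustrative sketch: poll an operation until done, with an attempt limit.
// getOperation is a caller-supplied refresh function (an assumption here).
async function pollUntilDone(operation, getOperation, intervalMs = 5000, maxAttempts = 60) {
  for (let attempt = 0; attempt < maxAttempts && !operation.done; attempt++) {
    await new Promise(resolve => setTimeout(resolve, intervalMs));
    operation = await getOperation(operation);
  }
  if (!operation.done) throw new Error('Operation timed out');
  return operation;
}

// Demo with a fake operation that completes after two refreshes.
let refreshes = 0;
pollUntilDone({ done: false }, async () => ({ done: ++refreshes >= 2 }), 1, 10)
  .then(op => console.log(op.done, refreshes)); // true 2
```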

queryWithUrlContext(prompt, model?, apiKey?)

Query content from URLs using the URL Context tool. URLs should be included in the prompt text.

import { queryWithUrlContext } from 'gemini-ai-toolkit';

const result = await queryWithUrlContext(
  'Compare the ingredients from https://example.com/recipe1 and https://example.com/recipe2'
);
console.log(result.text);

// Access URL retrieval metadata
const urlMetadata = result.candidates?.[0]?.urlContextMetadata;
console.log(urlMetadata);

Parameters:

  • prompt (string, required): The prompt containing URLs to analyze
  • model (string, optional): Model name (default: 'gemini-2.5-flash')
  • apiKey (string, optional): API key

Returns: Promise<GroundedResult> - Query results with URL metadata

Note: Up to 20 URLs can be processed per request. Maximum 34MB per URL.
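Given these limits, it can be worth validating a prompt before sending it. An illustrative pre-flight check (the regex is a rough heuristic, not the API's own URL detection):

```javascript
// Illustrative sketch: extract URLs from a prompt and check the documented
// per-request maximum of 20 before calling the API.
function extractUrls(prompt) {
  return prompt.match(/https?:\/\/[^\s)]+/g) ?? [];
}

const prompt = 'Compare https://example.com/a and https://example.com/b';
const urls = extractUrls(prompt);
console.log(urls.length);       // 2
console.log(urls.length <= 20); // true
```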

queryWithUrlContextAndSearch(prompt, model?, apiKey?)

Query with both URL Context and Google Search tools enabled.

import { queryWithUrlContextAndSearch } from 'gemini-ai-toolkit';

const result = await queryWithUrlContextAndSearch(
  'Find AI trends and analyze https://example.com/ai-report'
);

Returns: Promise<GroundedResult> - Combined search and URL analysis results


Class API (GeminiToolkit)

For applications needing more control, use the class API:


Service-Based Architecture (Recommended)

The toolkit uses a modular service-based architecture for better maintainability and separation of concerns. Each service handles a specific domain of functionality.

import { GeminiToolkit, CoreAIService, ChatService, GroundingService } from 'gemini-ai-toolkit';

// Initialize toolkit once
const toolkit = new GeminiToolkit({
  apiKey: 'your-api-key-here'
});

// Access service instances
const { coreAI, chat, grounding, fileSearch, files, cache, tokens } = toolkit;

Constructor

new GeminiToolkit(config: GeminiToolkitConfig)

Config:

  • apiKey (string, required): Your Gemini API key

Service Properties

The GeminiToolkit class provides the following service instances:

  • coreAI: CoreAIService - Text, image, video, speech generation
  • chat: ChatService - Chat conversations, live sessions, ephemeral tokens
  • grounding: GroundingService - Google Search, Maps, URL context
  • fileSearch: FileSearchService - File Search (RAG) operations
  • files: FilesService - File upload/management operations
  • cache: CacheService - Context caching operations
  • tokens: TokenService - Token counting operations

CoreAIService

Handles core AI generation operations including text, images, videos, and speech.

// Text generation
const text = await coreAI.generateText('Explain quantum computing', {
  model: 'gemini-2.5-pro',
  config: { temperature: 0.7 }
});

// Image generation
const imageB64 = await coreAI.generateImage('A futuristic robot', {
  aspectRatio: '16:9',
  personGeneration: 'allow_adult'
});

// Video generation (from image)
const videoResult = await coreAI.generateVideo(imageB64, 'image/jpeg', 'Make it dance', {
  durationSeconds: 4,
  fps: 30
});

// Image editing
const editedImage = await coreAI.editImage(existingImageB64, 'image/jpeg', 'Add a hat');

// Media analysis
const analysis = await coreAI.analyzeMedia(imageB64, 'image/jpeg', 'What do you see?');

// Speech synthesis
const audioB64 = await coreAI.generateSpeech('Hello, world!', {
  voiceName: 'Puck',
  languageCode: 'en-US'
});

Methods:

  • generateText(prompt, options?) - Generate text content
  • generateImage(prompt, options?) - Generate images
  • editImage(imageB64, mimeType, prompt, model?) - Edit existing images
  • analyzeMedia(data, mimeType, prompt, options?) - Analyze images/videos/audio
  • generateVideo(imageB64, mimeType, prompt, options?) - Generate videos from images
  • generateSpeech(text, options?) - Generate speech audio

ChatService

Manages chat conversations, live sessions, and ephemeral tokens.

// Create chat sessions
const chat = chatService.createChat('gemini-2.5-pro');
const response = await chat.sendMessage({ message: 'Hello!' });

// Ephemeral tokens for live sessions
const token = await chatService.createEphemeralToken({
  uses: 1,
  expireTime: new Date(Date.now() + 30 * 60 * 1000) // 30 minutes
});

// Live conversation sessions
const liveSession = await chatService.connectLive({
  onmessage: async (message) => console.log('Received:', message),
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Session closed')
}, {
  model: 'gemini-2.0-flash-exp',
  responseModalities: ['text']
}, token.name);

// Music generation (experimental)
const musicSession = await chatService.connectMusic({
  onmessage: async (message) => {
    if (message.serverContent?.audioChunks) {
      // Process 16-bit PCM audio chunks at 48kHz stereo
    }
  },
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Session closed')
});

Methods:

  • createChat(model?) - Create a chat session
  • createEphemeralToken(config?) - Create ephemeral tokens for live sessions
  • connectLive(callbacks, options?, ephemeralToken?) - Start live conversation
  • connectMusic(callbacks, apiKey?) - Start music generation session

GroundingService

Provides grounding capabilities with Google Search, Maps, and URL context.

// Ground with Google Search
const searchResult = await grounding.groundWithSearch(
  'Latest developments in quantum computing',
  'gemini-2.5-pro'
);
console.log(searchResult.text); // Grounded response
console.log(searchResult.candidates[0].citationMetadata?.citations); // Citations

// Ground with Google Maps
const mapsResult = await grounding.groundWithMaps(
  'Find Italian restaurants near Central Park',
  { latitude: 40.7829, longitude: -73.9654 },
  'gemini-2.5-pro'
);

// Generate with URL context (include the URL(s) directly in the prompt)
const urlResult = await grounding.generateWithUrlContext(
  'Summarize the main points from https://example.com/article',
  'gemini-2.5-pro'
);

// Combine URL context with search
const combinedResult = await grounding.generateWithUrlContextAndSearch(
  'Compare the information from https://example.com/report with current developments',
  'gemini-2.5-pro'
);

Methods:

  • groundWithSearch(prompt, model?) - Generate with Google Search grounding
  • groundWithMaps(prompt, location, model?) - Generate with Google Maps grounding
  • generateWithUrlContext(prompt, model?) - Generate with URL context
  • generateWithUrlContextAndSearch(prompt, model?) - Generate with URL context + search

FileSearchService

Manages File Search (Retrieval Augmented Generation) operations.

// Create a file search store
const store = await fileSearch.createFileSearchStore('my-documents');
console.log(`Store created: ${store.name}`);

// Upload files to the store
const operation = await fileSearch.uploadToFileSearchStore(
  'document.pdf',
  store.name,
  { mimeType: 'application/pdf' }
);

// Wait for processing to complete
// ... (polling logic)

// Query the store
const answer = await fileSearch.queryWithFileSearch(
  'What are the key findings?',
  {
    fileSearchStoreNames: [store.name],
    maxNumResults: 5,
    resultThreshold: 0.7
  },
  'gemini-2.5-pro'
);

// Import existing files
await fileSearch.importFileToFileSearchStore(
  store.name,
  'files/document.pdf',
  { mimeType: 'application/pdf' }
);

// List and manage stores
const stores = await fileSearch.listFileSearchStores();
const storeInfo = await fileSearch.getFileSearchStore(store.name);
await fileSearch.deleteFileSearchStore(store.name);

Methods:

  • createFileSearchStore(displayName?) - Create a new file search store
  • listFileSearchStores() - List all file search stores
  • getFileSearchStore(name) - Get store details
  • deleteFileSearchStore(name, force?) - Delete a store
  • uploadToFileSearchStore(file, storeName, config?, apiKey?) - Upload file to store
  • importFileToFileSearchStore(storeName, fileName, config?) - Import existing file
  • queryWithFileSearch(prompt, config, model?) - Query files with RAG

FilesService

Handles file upload, retrieval, listing, and deletion operations.

// Upload files
const file = await files.uploadFile('image.jpg', {
  displayName: 'My Image',
  mimeType: 'image/jpeg'
});
console.log(`Uploaded: ${file.name}`);

// Get file information
const fileInfo = await files.getFile(file.name);
console.log(`State: ${fileInfo.state}, Size: ${fileInfo.sizeBytes} bytes`);

// List files
const allFiles = await files.listFiles(10); // max 10 results
allFiles.files.forEach(f => console.log(`${f.name}: ${f.displayName}`));

// Delete files
await files.deleteFile(file.name);

Methods:

  • uploadFile(filePath, config?) - Upload a file
  • getFile(fileName) - Get file information
  • listFiles(pageSize?) - List uploaded files
  • deleteFile(fileName) - Delete a file

CacheService

Manages context caching for cost reduction on repeated requests.

// Create a cache (named `myCache` to avoid shadowing the `cache` service instance)
const myCache = await cache.createCache('gemini-2.0-flash-001', {
  systemInstruction: 'You are a helpful assistant specializing in JavaScript.',
  contents: [
    {
      role: 'user',
      parts: [{ text: 'Explain closures in JavaScript.' }]
    },
    {
      role: 'model',
      parts: [{ text: 'Closures are...' }]
    }
  ],
  ttl: '3600s' // 1 hour
});

// Use cached content
const response = await coreAI.generateText(
  'Give me an example of a closure',
  { cachedContent: myCache.name }
);

// List and manage caches
const caches = await cache.listCaches();
const cacheInfo = await cache.getCache(myCache.name);
await cache.updateCache(myCache.name, { ttl: '7200s' }); // Extend TTL
await cache.deleteCache(myCache.name);

Methods:

  • createCache(model, config) - Create a new cache
  • listCaches() - List all caches
  • getCache(cacheName) - Get cache details
  • updateCache(cacheName, config) - Update cache settings
  • deleteCache(cacheName) - Delete a cache

TokenService

Provides token counting for cost estimation.

// Count tokens in text
const count = await tokens.countTokens('Hello, world!');
console.log(`Total tokens: ${count.totalTokens}`);

// Count tokens with model context
const countWithModel = await tokens.countTokens(
  'Explain quantum computing',
  'gemini-2.5-pro'
);

// Count tokens for multimodal content
const multimodalCount = await tokens.countTokens([
  { text: 'Describe this image:' },
  { inlineData: { mimeType: 'image/jpeg', data: imageBase64 } }
], 'gemini-2.5-pro');

Methods:

  • countTokens(contents, model?) - Count tokens in content

Legacy Direct Methods (Deprecated)

For backward compatibility, the GeminiToolkit class still provides direct methods, but these are deprecated. Use the service instances instead.

Deprecated Methods

generateText(prompt, options?)

Generate text content.

const text = await toolkit.generateText('Hello, world!', {
  model: 'gemini-2.5-pro',
  config: {
    temperature: 0.7,
    maxOutputTokens: 2000,
    topP: 0.95,
    topK: 40
  }
});

Options:

  • model (string): Model name (default: 'gemini-2.5-flash')
  • config (object): Additional model configuration

createChat(model?)

Create a chat session.

const chat = toolkit.createChat('gemini-2.5-pro');

// Send message
const response = await chat.sendMessage({ 
  message: 'Hello!' 
});

// Streaming
const stream = await chat.sendMessageStream({ 
  message: 'Tell a story' 
});

for await (const chunk of stream) {
  console.log(chunk.text);
}

Chat Methods:

  • sendMessage({ message }) - Send a message and get response
  • sendMessageStream({ message }) - Stream response chunks

generateImage(prompt, options?)

Generate images.

const imageBase64 = await toolkit.generateImage(
  'A futuristic city at sunset',
  {
    aspectRatio: '16:9',
    outputMimeType: 'image/png',
    numberOfImages: 1
  }
);

Options:

  • model (string): Model name (default: 'imagen-4.0-generate-001')
  • aspectRatio (ImageAspectRatio): '1:1', '16:9', '9:16', '4:3', '3:4'
  • numberOfImages (number): 1-4 (default: 1)
  • outputMimeType (string): 'image/png', 'image/jpeg', 'image/webp'

editImage(imageBase64, mimeType, prompt, model?)

Edit images.

const edited = await toolkit.editImage(
  imageBase64,
  'image/png',
  'Apply a retro 80s filter with warm tones'
);

analyzeMedia(data, mimeType, prompt, options?)

Analyze images, video frames, or audio.

// Single image
const analysis = await toolkit.analyzeMedia(
  imageBase64,
  'image/png',
  'What is in this image?'
);

// Multiple frames (video)
const frames = [frame1, frame2, frame3];
const videoAnalysis = await toolkit.analyzeMedia(
  frames,
  'image/jpeg',
  'Describe the video content',
  { isVideo: true }
);

Options:

  • model (string): Model name
  • isVideo (boolean): Set to true for video analysis

generateVideo(imageBase64, mimeType, prompt, options?)

Generate videos from images.

const operation = await toolkit.generateVideo(
  imageBase64,
  'image/png',
  'Make the scene come alive with gentle movement',
  {
    aspectRatio: '16:9',
    resolution: '1080p'
  }
);

// Poll for completion
// Note: Video generation is asynchronous

Options:

  • model (string): Model name (default: 'veo-3.1-fast-generate-preview')
  • aspectRatio (VideoAspectRatio): '16:9' or '9:16'
  • resolution (string): '720p' or '1080p'
  • numberOfVideos (number): 1

Returns: Operation object (poll for completion)
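
Operations like this resolve asynchronously. The File Search examples later in this document poll with `toolkit.getClient().operations.get({ operation })`; that pattern can be wrapped in a small generic helper. This is a sketch — `refresh` is a placeholder for whatever call fetches the latest operation state in your setup:

```javascript
// Generic long-running-operation poller (sketch). `refresh` is any async
// function that returns the latest state of the operation, e.g.
// (op) => toolkit.getClient().operations.get({ operation: op }).
async function pollOperation(operation, refresh, intervalMs = 5000) {
  let op = operation;
  while (!op.done) {
    // Wait between polls, then fetch the latest operation state
    await new Promise(resolve => setTimeout(resolve, intervalMs));
    op = await refresh(op);
  }
  return op;
}
```

The same helper works for video generation and File Search uploads alike, since both return an operation object with a `done` flag.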

generateSpeech(text, options?)

Convert text to speech.

const audioBase64 = await toolkit.generateSpeech('Hello, world!', {
  voiceName: 'Kore',
  model: 'gemini-2.5-flash-preview-tts'
});

Options:

  • model (string): Model name
  • voiceName (string): 'Kore' or 'Zephyr'

createEphemeralToken(config?)

Create an ephemeral token for secure Live API access from client-side applications.

⚠️ Server-side only - Call this from your backend, not client-side.

// Server-side: Create ephemeral token
const token = await toolkit.createEphemeralToken({
  uses: 1, // Token can only be used once
  expireTime: new Date(Date.now() + 30 * 60 * 1000), // 30 minutes
  newSessionExpireTime: new Date(Date.now() + 60 * 1000), // 1 minute
  liveConnectConstraints: {
    model: 'gemini-2.5-flash-native-audio-preview-09-2025',
    config: {
      temperature: 0.7,
      responseModalities: ['AUDIO']
    }
  }
});

// Send token.name to client
// Client uses token.name as API key for connectLive()

Options:

  • uses (number): Number of times token can be used (default: 1)
  • expireTime (Date | string): Token expiration (default: 30 minutes)
  • newSessionExpireTime (Date | string): New session expiration (default: 1 minute)
  • liveConnectConstraints (object): Lock token to specific config

Returns: EphemeralToken with name property (use as API key)

connectLive(callbacks, options?, ephemeralToken?)

Connect to live conversation session. Can use standard API key or ephemeral token.

// Basic usage with standard API key
const session = await toolkit.connectLive({
  onopen: () => console.log('Connected'),
  onmessage: async (message) => {
    console.log('Received:', message);
  },
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Disconnected')
});

// Using ephemeral token (client-side, enhanced security)
const session = await toolkit.connectLive(
  {
    onopen: () => console.log('Connected'),
    onmessage: async (message) => {
      console.log('Received:', message);
    },
    onerror: (error) => console.error('Error:', error),
    onclose: () => console.log('Disconnected')
  },
  {}, // options
  ephemeralToken.name // Token from server
);

// With function calling
const session = await toolkit.connectLive({
  onopen: () => console.log('Connected'),
  onmessage: async (message) => {
    // Handle tool calls
    if (message.toolCall) {
      const functionResponses = [];
      for (const fc of message.toolCall.functionCalls) {
        functionResponses.push({
          id: fc.id,
          name: fc.name,
          response: { result: 'ok' }
        });
      }
      await session.sendToolResponse({ functionResponses });
    }
  },
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Disconnected')
}, {
  tools: [{
    functionDeclarations: [
      { name: 'turn_on_lights' },
      { name: 'turn_off_lights', behavior: 'NON_BLOCKING' }
    ]
  }]
});

// With Google Search
const session = await toolkit.connectLive({
  onopen: () => console.log('Connected'),
  onmessage: async (message) => {
    // Handle search results
    if (message.serverContent?.modelTurn?.parts) {
      for (const part of message.serverContent.modelTurn.parts) {
        if (part.executableCode) {
          console.log('Code:', part.executableCode.code);
        }
      }
    }
  },
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Disconnected')
}, {
  tools: [{ googleSearch: {} }]
});

// With session management
const session = await toolkit.connectLive({
  onopen: () => console.log('Connected'),
  onmessage: async (message) => {
    // Handle session resumption updates
    if (message.sessionResumptionUpdate?.newHandle) {
      // Save handle for resuming session
      const newHandle = message.sessionResumptionUpdate.newHandle;
    }
    
    // Handle GoAway message
    if (message.goAway) {
      console.log('Connection closing soon:', message.goAway.timeLeft);
    }
    
    // Handle generation complete
    if (message.serverContent?.generationComplete) {
      console.log('Generation complete');
    }
  },
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Disconnected')
}, {
  contextWindowCompression: { slidingWindow: {} },
  sessionResumption: { handle: previousSessionHandle }
});

// Send audio
await session.sendAudio(audioData);

// Send text
session.sendClientContent({ turns: 'Hello!', turnComplete: true });

// Close session
await session.close();

Callbacks:

  • onopen(): Called when connection opens
  • onmessage(message): Called when message received
    • Check message.toolCall for function calls
    • Check message.serverContent for model responses
    • Check message.sessionResumptionUpdate for resumption tokens
    • Check message.goAway for connection termination warnings
  • onerror(error): Called on error
  • onclose(event): Called when connection closes

Options:

  • model (string): Model name (default: Live model)
  • voiceName (string): Voice name (default: 'Zephyr')
  • responseModalities (Modality[]): Response modalities (default: ['AUDIO'])
  • tools (LiveTool[]): Tools to enable (function calling, Google Search)
  • inputAudioTranscription (boolean): Enable input audio transcription
  • outputAudioTranscription (boolean): Enable output audio transcription
  • contextWindowCompression (ContextWindowCompressionConfig): Enable compression for longer sessions
  • sessionResumption (SessionResumptionConfig): Configure session resumption
  • realtimeInputConfig (RealtimeInputConfig): Configure VAD settings
  • thinkingConfig (ThinkingConfig): Configure thinking budget
  • enableAffectiveDialog (boolean): Enable affective dialog (requires v1alpha)
  • proactivity (ProactivityConfig): Configure proactive audio
  • mediaResolution (MediaResolution): Set media resolution
  • temperature (number): Temperature setting

Ephemeral Token:

  • Pass ephemeralToken.name as third parameter for client-side security
  • Ephemeral tokens are short-lived and reduce security risks

Tool Use:

  • Function calling: Define functions in tools[].functionDeclarations
  • Google Search: Enable with tools: [{ googleSearch: {} }]
  • Handle tool calls in onmessage callback
  • Respond with session.sendToolResponse({ functionResponses })

Session Management:

  • Context window compression: Extend sessions beyond 15 minutes
  • Session resumption: Resume sessions across connection resets
  • GoAway messages: Receive warnings before connection termination

connectMusic(callbacks, apiKey?)

Connect to Lyria RealTime music generation session for real-time streaming music.

⚠️ Experimental: Lyria RealTime is an experimental model.

⚠️ Requires v1alpha API: This feature requires the v1alpha API version.

const session = await toolkit.connectMusic({
  onmessage: async (message) => {
    // Process audio chunks (16-bit PCM, 48kHz, stereo)
    if (message.serverContent?.audioChunks) {
      for (const chunk of message.serverContent.audioChunks) {
        const audioBuffer = Buffer.from(chunk.data, 'base64');
        // Play audio...
      }
    }
  },
  onerror: (error) => console.error('Error:', error),
  onclose: () => console.log('Session closed')
});

// Set initial prompts
await session.setWeightedPrompts({
  weightedPrompts: [
    { text: 'minimal techno', weight: 1.0 },
    { text: 'deep bass', weight: 0.5 }
  ]
});

// Set generation config
await session.setMusicGenerationConfig({
  musicGenerationConfig: {
    bpm: 90,
    temperature: 1.0,
    density: 0.7,
    brightness: 0.6,
    scale: 'C_MAJOR_A_MINOR',
    audioFormat: 'pcm16',
    sampleRateHz: 48000
  }
});

// Start generating music
await session.play();

// Control playback
await session.pause();
await session.play();
await session.stop();
await session.resetContext();

// Update prompts in real-time
await session.setWeightedPrompts({
  weightedPrompts: [
    { text: 'Piano', weight: 2.0 },
    { text: 'Meditation', weight: 0.5 }
  ]
});

// Update config (reset context for BPM/scale changes)
await session.setMusicGenerationConfig({
  musicGenerationConfig: {
    bpm: 120,
    scale: 'D_MAJOR_B_MINOR'
  }
});
await session.resetContext();

Callbacks:

  • onmessage(message): Called when audio chunks or other messages are received
  • onerror(error): Called when an error occurs
  • onclose(): Called when the session closes

Session Methods:

  • setWeightedPrompts({ weightedPrompts }): Set or update music prompts
  • setMusicGenerationConfig({ musicGenerationConfig }): Set or update generation config
  • play(): Start/resume music generation
  • pause(): Pause music generation
  • stop(): Stop music generation
  • resetContext(): Reset context (required after BPM/scale changes)

Music Generation Config:

  • guidance (0.0-6.0, default: 4.0): How strictly model follows prompts
  • bpm (60-200): Beats Per Minute
  • density (0.0-1.0): Density of musical notes/sounds
  • brightness (0.0-1.0): Tonal quality
  • scale (MusicScale): Musical scale/key
  • muteBass (boolean): Mute bass output
  • muteDrums (boolean): Mute drums output
  • onlyBassAndDrums (boolean): Only output bass and drums
  • musicGenerationMode ('QUALITY' | 'DIVERSITY' | 'VOCALIZATION'): Generation mode
  • temperature (0.0-3.0, default: 1.1): Temperature setting
  • topK (1-1000, default: 40): Top K sampling
  • seed (0-2147483647): Random seed
  • audioFormat (string, default: 'pcm16'): Audio format
  • sampleRateHz (number, default: 48000): Sample rate
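
Out-of-range values are easy to send by accident. A small hypothetical guard (not part of the toolkit) can clamp a config to the documented bounds before passing it to `setMusicGenerationConfig`:

```javascript
// Hypothetical client-side guard: clamp values to the documented ranges
// before passing them to session.setMusicGenerationConfig().
function clampMusicConfig(config) {
  const clamp = (v, lo, hi) => Math.min(hi, Math.max(lo, v));
  const out = { ...config };
  if (out.guidance !== undefined) out.guidance = clamp(out.guidance, 0.0, 6.0);
  if (out.bpm !== undefined) out.bpm = clamp(Math.round(out.bpm), 60, 200);
  if (out.density !== undefined) out.density = clamp(out.density, 0.0, 1.0);
  if (out.brightness !== undefined) out.brightness = clamp(out.brightness, 0.0, 1.0);
  if (out.temperature !== undefined) out.temperature = clamp(out.temperature, 0.0, 3.0);
  if (out.topK !== undefined) out.topK = clamp(Math.round(out.topK), 1, 1000);
  return out;
}
```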

Audio Format:

  • Output: Raw 16-bit PCM Audio
  • Sample rate: 48kHz
  • Channels: 2 (stereo)

Note:

  • Prompts are checked by safety filters
  • Output audio is watermarked
  • Model generates instrumental music only
  • Implement robust audio buffering for smooth playback
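
For the buffering point above, it helps to know exactly how much audio each chunk carries. A minimal Node sketch that decodes a base64 chunk into samples and derives its duration from the 48 kHz stereo format (little-endian sample order is assumed here):

```javascript
// Decode one base64 audio chunk (raw 16-bit PCM, little-endian assumed)
// and compute its playback duration at 48 kHz stereo.
function decodeChunk(base64Data) {
  const buf = Buffer.from(base64Data, 'base64');
  const samples = new Int16Array(buf.length / 2);
  for (let i = 0; i < samples.length; i++) {
    samples[i] = buf.readInt16LE(i * 2);
  }
  const seconds = samples.length / 2 / 48000; // 2 channels, 48000 frames/s
  return { samples, seconds };
}
```

Knowing each chunk's duration lets a playback queue schedule chunks back to back without gaps.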
groundWithSearch(prompt, model?)

Get answers grounded in Google Search.

const result = await toolkit.groundWithSearch(
  'What are the latest AI developments?',
  'gemini-2.5-pro'
);

console.log(result.text);
console.log(result.candidates); // Citations

Returns: GroundedResult with text and candidates

groundWithMaps(prompt, location?, model?)

Get location-based information.

const result = await toolkit.groundWithMaps(
  'Find nearby coffee shops',
  { latitude: 37.7749, longitude: -122.4194 },
  'gemini-2.5-pro'
);

Returns: GroundedResult

URL Context Methods
generateWithUrlContext(prompt, model?)

Generate text with URL Context tool enabled, allowing the model to access content from URLs.

// Basic usage - URLs in prompt
const result = await toolkit.generateWithUrlContext(
  'Compare the ingredients from https://example.com/recipe1 and https://example.com/recipe2'
);
console.log(result.text);

// Access URL retrieval metadata
const urlMetadata = result.candidates?.[0]?.urlContextMetadata;
if (urlMetadata?.urlMetadata) {
  urlMetadata.urlMetadata.forEach((meta) => {
    console.log(`URL: ${meta.retrievedUrl}`);
    console.log(`Status: ${meta.urlRetrievalStatus}`);
  });
}

Parameters:

  • prompt (string): The prompt containing URLs to analyze (URLs should be in the prompt text)
  • model (string, optional): Model name (default: 'gemini-2.5-flash')

Returns: Promise<GroundedResult> - Results with URL metadata

Limitations:

  • Up to 20 URLs per request
  • Maximum 34MB per URL
  • URLs must be publicly accessible (no login/paywall)
  • Supported content types: HTML, JSON, PDF, images (PNG, JPEG, BMP, WebP)
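
Given the 20-URL cap, a quick pre-flight check on the prompt can catch oversized requests before they reach the API. This helper is hypothetical (not part of the toolkit), and its URL regex is only a rough match:

```javascript
// Hypothetical pre-flight check for the 20-URLs-per-request limit.
function extractUrls(prompt) {
  return prompt.match(/https?:\/\/[^\s)]+/g) ?? [];
}

function assertUrlLimit(prompt, max = 20) {
  const urls = extractUrls(prompt);
  if (urls.length > max) {
    throw new Error(`URL Context supports at most ${max} URLs, found ${urls.length}`);
  }
  return urls;
}
```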

Use Cases:

  • Extract data from multiple URLs
  • Compare documents, articles, or reports
  • Synthesize content from several sources
  • Analyze code and documentation from GitHub

generateWithUrlContextAndSearch(prompt, model?)

Generate text with both URL Context and Google Search tools enabled.

const result = await toolkit.generateWithUrlContextAndSearch(
  'Find the latest AI developments and analyze https://example.com/ai-report'
);

Use Cases:

  • Search the web and then analyze specific URLs in depth
  • Combine broad search with detailed URL analysis
  • Get comprehensive answers using both tools

Parameters:

  • prompt (string): The prompt containing URLs and/or search queries
  • model (string, optional): Model name (default: 'gemini-2.5-flash')

Returns: Promise<GroundedResult> - Combined results

generateWithThinking(prompt, thinkingBudget?, model?)

Generate text with extended thinking capabilities.

const result = await toolkit.generateWithThinking(
  'Solve this complex problem step by step...',
  32768, // Thinking budget
  'gemini-2.5-pro'
);

Parameters:

  • prompt (string): The problem to solve
  • thinkingBudget (number): Tokens for thinking (default: 32768)
  • model (string): Model name (default: 'gemini-2.5-pro')

Files API Methods
uploadFile(filePath, config?)

Upload a file using the Files API. Use this when the request size exceeds 20MB or when you need reusable file references.

// Node.js
const file = await toolkit.uploadFile('document.pdf', {
  displayName: 'My Document',
  mimeType: 'application/pdf'
});

// Browser
const fileInput = document.querySelector('input[type="file"]');
const file = await toolkit.uploadFile(fileInput.files[0], {
  displayName: 'My Document'
});

// Use in generateText
const result = await toolkit.generateText('Describe this document', {
  files: [file]
});

Parameters:

  • filePath (string | File | Blob): Path to file (Node.js) or File/Blob (browser)
  • config (UploadFileConfig | string, optional): Configuration or display name
    • displayName (string, optional): Display name for the file
    • mimeType (string, optional): MIME type (auto-detected if not provided)

Returns: Promise<FileObject> - Uploaded file with metadata

Note: Files are automatically deleted after 48 hours. Use for files larger than 20MB or when you need to reuse files across multiple requests.

getFile(fileName)

Get metadata for an uploaded file.

const file = await toolkit.uploadFile('document.pdf');
const metadata = await toolkit.getFile(file.name);
console.log(metadata.state); // 'ACTIVE' or 'PROCESSING'
console.log(metadata.sizeBytes);
console.log(metadata.expireTime);

Parameters:

  • fileName (string): Name of the file (from uploadFile response)

Returns: Promise<FileObject> - File metadata

listFiles(pageSize?)

List all uploaded files.

const files = await toolkit.listFiles(10);
for await (const file of files) {
  console.log(file.name, file.displayName, file.state);
}

Parameters:

  • pageSize (number, optional): Page size for pagination (max: 100)

Returns: Promise<Iterable<FileObject>> - Iterable of files

deleteFile(fileName)

Delete an uploaded file.

await toolkit.deleteFile('files/my-file-123');

Parameters:

  • fileName (string): Name of the file to delete

Returns: Promise<void>
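
As a rule of thumb for the 20 MB guideline mentioned above, a small Node check (a sketch, not part of the toolkit) can decide between sending content inline and routing it through the Files API:

```javascript
import { statSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Sketch: route files above the ~20 MB request-size guideline
// through the Files API instead of sending them inline.
function shouldUseFilesApi(filePath, thresholdBytes = 20 * 1024 * 1024) {
  return statSync(filePath).size > thresholdBytes;
}

// Demo with a tiny temp file
const demoPath = join(tmpdir(), 'gemini-size-check.txt');
writeFileSync(demoPath, 'hello');
console.log(shouldUseFilesApi(demoPath)); // false: 5 bytes is far below 20 MB
```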

Context Caching Methods
createCache(model, config)

Create a cache for context caching to reduce costs on repeated requests.

// Note `let` — the variable is reassigned while polling
let videoFile = await toolkit.uploadFile('movie.mp4');

// Wait for processing
while (videoFile.state !== 'ACTIVE') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await toolkit.getFile(videoFile.name);
}

const cache = await toolkit.createCache('gemini-2.0-flash-001', {
  displayName: 'movie-analysis-cache',
  systemInstruction: 'You are an expert video analyzer.',
  contents: [videoFile],
  ttl: '300s' // 5 minutes
});

// Use cache
const result = await toolkit.generateText('Describe the characters', {
  cachedContent: cache.name
});

Parameters:

  • model (string): Model name (must use explicit version like gemini-2.0-flash-001)
  • config (CreateCacheConfig): Cache configuration
    • displayName (string, optional): Display name
    • systemInstruction (string, optional): System instruction to cache
    • contents (unknown[], optional): Contents to cache
    • ttl (string | number, optional): Time to live (e.g., '300s' or 300)
    • expireTime (Date | string, optional): Expiration time

Returns: Promise<CachedContent>

Note: Minimum 2,048 tokens (2.5 Flash) or 4,096 tokens (2.5 Pro). Cached tokens billed at reduced rate.
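
The examples pass TTLs as strings like '300s'. A tiny convenience formatter (hypothetical, not part of the toolkit) derives them from seconds and avoids unit slips:

```javascript
// Hypothetical convenience: format a TTL in seconds as the 'Ns' string
// the cache examples use (e.g. '300s' for five minutes).
const ttlSeconds = (seconds) => `${Math.round(seconds)}s`;

console.log(ttlSeconds(5 * 60));  // '300s' — matches the example above
console.log(ttlSeconds(60 * 60)); // '3600s' — one hour
```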

listCaches()

List all cached content objects.

const caches = await toolkit.listCaches();
for await (const cache of caches) {
  console.log(cache.name, cache.displayName);
}

Returns: Promise<Iterable<CachedContent>>

getCache(cacheName)

Get metadata for a cached content object.

const cache = await toolkit.getCache('cachedContents/my-cache-123');
console.log(cache.expireTime);

Returns: Promise<CachedContent>

updateCache(cacheName, config)

Update a cache's TTL or expiration time.

await toolkit.updateCache(cache.name, { ttl: '600s' });

Returns: Promise<CachedContent>

deleteCache(cacheName)

Delete a cached content object.

await toolkit.deleteCache('cachedContents/my-cache-123');

Returns: Promise<void>

Token Counting Methods
countTokens(contents, model?)

Count tokens for any content before sending to the API.

// Count text tokens
const count = await toolkit.countTokens('Hello, world!');
console.log(count.totalTokens);

// Count tokens for file + text
const file = await toolkit.uploadFile('image.jpg');
const count = await toolkit.countTokens(['Describe this image', file]);

// Count chat history
const chat = toolkit.createChat();
await chat.sendMessage({ message: 'Hello' });
const count = await toolkit.countTokens(chat.getHistory());

Parameters:

  • contents (unknown): Content to count (text, files, chat history, etc.)
  • model (string, optional): Model name (default: 'gemini-2.5-flash')

Returns: Promise<TokenCount> - Token count result

Note: 1 token ≈ 4 characters, 100 tokens ≈ 60-80 words. Images: 258 tokens (2.0) or variable. Video: 263 tokens/sec. Audio: 32 tokens/sec.
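
Those heuristics can be folded into a rough pre-flight estimator. This is an approximation only — call countTokens for billing-accurate numbers:

```javascript
// Rough pre-flight estimate from the heuristics above:
// ~4 characters per text token, 263 tokens/s of video, 32 tokens/s of audio.
function estimateTokens({ text = '', videoSeconds = 0, audioSeconds = 0 } = {}) {
  return Math.ceil(text.length / 4)
    + Math.round(videoSeconds * 263)
    + Math.round(audioSeconds * 32);
}
```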

File Search (RAG) Methods
createFileSearchStore(displayName?)

Create a new File Search store for RAG.

const store = await toolkit.createFileSearchStore('my-documents');
console.log(store.name); // Use this for uploads and queries

Parameters:

  • displayName (string, optional): Display name for the store

Returns: Promise<FileSearchStore> - Created File Search store

listFileSearchStores()

List all File Search stores.

const stores = toolkit.listFileSearchStores();
for await (const store of stores) {
  console.log(store.name, store.displayName);
}

Returns: AsyncIterable<FileSearchStore> - Iterable of File Search stores

getFileSearchStore(name)

Get a specific File Search store by name.

const store = await toolkit.getFileSearchStore('fileSearchStores/my-store-123');

Parameters:

  • name (string): Store name (e.g., 'fileSearchStores/my-store-123')

Returns: Promise<FileSearchStore> - File Search store

deleteFileSearchStore(name, force?)

Delete a File Search store.

await toolkit.deleteFileSearchStore('fileSearchStores/my-store-123', true);

Parameters:

  • name (string): Store name to delete
  • force (boolean): Force delete (default: true)

uploadToFileSearchStore(filePath, fileSearchStoreName, config?)

Upload a file directly to a File Search store (combines upload and import).

let operation = await toolkit.uploadToFileSearchStore(
  'document.pdf',
  store.name,
  {
    displayName: 'My Document',
    customMetadata: [
      { key: 'author', stringValue: 'Robert Graves' },
      { key: 'year', numericValue: 1934 }
    ],
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 200,
        maxOverlapTokens: 20
      }
    }
  }
);

// Poll for completion
while (!operation.done) {
  await new Promise(resolve => setTimeout(resolve, 5000));
  operation = await toolkit.getClient().operations.get({ operation });
}

Parameters:

  • filePath (string): Path to the file to upload
  • fileSearchStoreName (string): Name of the File Search store
  • config (FileSearchUploadConfig, optional): Upload configuration
    • displayName (string, optional): Display name for the file
    • customMetadata (FileMetadata[], optional): Custom metadata
    • chunkingConfig (ChunkingConfig, optional): Chunking configuration

Returns: Promise<Operation> - Operation that can be polled for completion

importFileToFileSearchStore(fileSearchStoreName, fileName, config?)

Import an existing file into a File Search store.

const operation = await toolkit.importFileToFileSearchStore(
  store.name,
  uploadedFile.name,
  {
    customMetadata: [
      { key: 'author', stringValue: 'Robert Graves' }
    ],
    chunkingConfig: {
      whiteSpaceConfig: {
        maxTokensPerChunk: 200,
        maxOverlapTokens: 20
      }
    }
  }
);

Parameters:

  • fileSearchStoreName (string): Name of the File Search store
  • fileName (string): Name of the file (from Files API)
  • config (FileSearchImportConfig, optional): Import configuration

Returns: Promise<Operation> - Operation that can be polled for completion

queryWithFileSearch(prompt, config, model?)

Query documents with File Search (RAG) to get answers grounded in uploaded documents.

// Basic query
const result = await toolkit.queryWithFileSearch(
  'Tell me about Robert Graves',
  { fileSearchStoreNames: [store.name] }
);
console.log(result.text);

// Query with metadata filter
const filteredResult = await toolkit.queryWithFileSearch(
  "Tell me about the book 'I, Claudius'",
  {
    fileSearchStoreNames: [store.name],
    metadataFilter: 'author="Robert Graves"'
  }
);

// Access citations
const citations = result.candidates?.[0]?.groundingMetadata;
console.log(citations);

Parameters:

  • prompt (string): The query or prompt
  • config (FileSearchQueryConfig): File Search configuration
    • fileSearchStoreNames (string[]): Array of File Search store names
    • metadataFilter (string, optional): Metadata filter (e.g., 'author="Robert Graves"')
  • model (string, optional): Model name (default: 'gemini-2.5-flash')

Returns: Promise<GroundedResult> - Query results with citations
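
A small helper can compose the metadataFilter string used above. This helper is hypothetical, and escaping embedded quotes is an assumption on our part, not documented filter grammar:

```javascript
// Hypothetical helper for composing the metadataFilter string shown above
// (e.g. author="Robert Graves"). Quote escaping is an assumption.
function metadataFilter(key, value) {
  return `${key}="${String(value).replace(/"/g, '\\"')}"`;
}
```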


🎨 Presets

79 ready-to-use preset configurations covering all use cases. No configuration needed - just use the preset!

Text Presets (22 presets)

import { generateText, presets } from 'gemini-ai-toolkit';

// Speed presets
const fast = await generateText('Quick answer', presets.text.fast);
const smart = await generateText('Complex question', presets.text.smart);

// Style presets
const creative = await generateText('Story', presets.text.creative);
const concise = await generateText('Summary', presets.text.concise);
const detailed = await generateText('Analysis', presets.text.detailed);
const balanced = await generateText('Answer', presets.text.balanced);

// Use case presets
const code = await generateText('Write function', presets.text.code);
const qa = await generateText('What is AI?', presets.text.qa);
const technical = await generateText('Explain API', pr