ak-gemini
v2.2.1
AK's Generative AI Helper for doing... everything
Modular, type-safe wrapper for Google's Gemini AI. Seven class exports for different interaction patterns — JSON transformation, chat, stateless messages, tool-using agents, code-writing agents, document Q&A, and embeddings — all sharing a common base.
npm install ak-gemini
Requires Node.js 18+ and @google/genai.
Quick Start
export GEMINI_API_KEY=your-key
import { Transformer, Chat, Message, ToolAgent, CodeAgent, RagAgent, Embedding } from 'ak-gemini';
Classes
Transformer — JSON Transformation
Transform structured data using few-shot examples with validation and retry.
const transformer = new Transformer({
modelName: 'gemini-2.5-flash',
sourceKey: 'INPUT',
targetKey: 'OUTPUT'
});
await transformer.init();
await transformer.seed([
{
INPUT: { name: 'Alice' },
OUTPUT: { name: 'Alice', role: 'engineer', emoji: '👩‍💻' }
}
]);
const result = await transformer.send({ name: 'Bob' });
// → { name: 'Bob', role: '...', emoji: '...' }
Validation & self-healing:
const result = await transformer.send({ name: 'Bob' }, {}, async (output) => {
if (!output.role) throw new Error('Missing role field');
return output;
});
Chat — Multi-Turn Conversation
const chat = new Chat({
systemPrompt: 'You are a helpful assistant.'
});
const r1 = await chat.send('My name is Alice.');
const r2 = await chat.send('What is my name?');
// r2.text → "Alice"
Message — Stateless One-Off
Each call is independent — no history maintained.
const msg = new Message({
systemPrompt: 'Extract entities as JSON.',
responseMimeType: 'application/json',
responseSchema: {
type: 'object',
properties: {
entities: { type: 'array', items: { type: 'string' } }
}
}
});
const result = await msg.send('Alice works at Acme in New York.');
// result.data → { entities: ['Alice', 'Acme', 'New York'] }
ToolAgent — Agent with User-Provided Tools
Provide tool declarations and an executor function. The agent manages the tool-use loop automatically.
const agent = new ToolAgent({
systemPrompt: 'You are a research assistant.',
tools: [
{
name: 'http_get',
description: 'Fetch a URL',
parametersJsonSchema: {
type: 'object',
properties: { url: { type: 'string' } },
required: ['url']
}
}
],
toolExecutor: async (toolName, args) => {
if (toolName === 'http_get') {
const res = await fetch(args.url);
return { status: res.status, body: await res.text() };
}
},
onBeforeExecution: async (toolName, args) => {
console.log(`About to call ${toolName}`);
return true; // return false to deny
}
});
const result = await agent.chat('Fetch https://api.example.com/data');
console.log(result.text); // Agent's summary
console.log(result.toolCalls); // [{ name, args, result }]
Streaming:
for await (const event of agent.stream('Fetch the data')) {
if (event.type === 'text') process.stdout.write(event.text);
if (event.type === 'tool_call') console.log(`Calling ${event.toolName}...`);
if (event.type === 'tool_result') console.log(`Result:`, event.result);
if (event.type === 'done') console.log('Done!');
}
CodeAgent — Agent That Writes and Executes Code
Instead of calling tools one by one, the model writes JavaScript that can do everything — read files, write files, run commands — in a single script. Inspired by the code mode philosophy.
const agent = new CodeAgent({
workingDirectory: '/path/to/my/project',
onCodeExecution: (code, output) => {
console.log('Ran:', code.slice(0, 100));
console.log('Output:', output.stdout);
},
onBeforeExecution: async (code) => {
// Review code before execution
console.log('About to run:', code);
return true; // return false to deny
}
});
const result = await agent.chat('Find all TODO comments in the codebase');
console.log(result.text); // Agent's summary
console.log(result.codeExecutions); // [{ code, output, stderr, exitCode }]
How it works:
- On init(), gathers codebase context (file tree + key files like package.json)
- Injects context into the system prompt so the model understands the project
- Model writes JavaScript using the execute_code tool
- Code runs in a Node.js child process that inherits process.env
- Output (stdout/stderr) feeds back to the model
- Model decides if more work is needed
Streaming:
for await (const event of agent.stream('Refactor the auth module')) {
if (event.type === 'text') process.stdout.write(event.text);
if (event.type === 'code') console.log('\n[Running code...]');
if (event.type === 'output') console.log('[Output]:', event.stdout);
if (event.type === 'done') console.log('\nDone!');
}
Embedding — Vector Embeddings
Generate vector embeddings for similarity search, clustering, and classification.
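For intuition, the cosine-similarity score returned by similarity() is plain vector math and needs no API call. A minimal sketch (illustrative; use the built-in method in practice):

```javascript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|). Returns 1 for identical direction, 0 for orthogonal.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```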
const embedder = new Embedding({
modelName: 'gemini-embedding-001', // default
taskType: 'RETRIEVAL_DOCUMENT'
});
// Single text
const result = await embedder.embed('Hello world');
console.log(result.values); // [0.012, -0.034, ...]
// Batch
const results = await embedder.embedBatch(['Hello', 'World']);
// Cosine similarity (pure math, no API call)
const score = embedder.similarity(results[0].values, results[1].values);
Embedding + RagAgent — Semantic Search Pipeline
Use embeddings to find relevant documents, then feed only the best matches to a RagAgent for grounded Q&A:
const embedder = new Embedding({ taskType: 'RETRIEVAL_DOCUMENT' });
const queryEmbedder = new Embedding({ taskType: 'RETRIEVAL_QUERY' });
// Index your documents
const docs = ['./docs/auth.md', './docs/billing.md', './docs/api.md', './docs/faq.md'];
const docTexts = await Promise.all(docs.map(f => fs.readFile(f, 'utf-8')));
const docVectors = await embedder.embedBatch(docTexts);
// Find the most relevant docs for a query
const query = 'How do I reset my API key?';
const queryVector = await queryEmbedder.embed(query);
const ranked = docVectors
.map((v, i) => ({ file: docs[i], score: embedder.similarity(queryVector.values, v.values) }))
.sort((a, b) => b.score - a.score);
// Feed only the top 2 matches to RagAgent
const rag = new RagAgent({ localFiles: ranked.slice(0, 2).map(r => r.file) });
const answer = await rag.chat(query);
Stopping Agents
Both ToolAgent and CodeAgent support a stop() method to cancel execution mid-loop. This is useful for implementing user-facing cancel buttons or safety limits.
// Stop from a callback
const agent = new ToolAgent({
tools: [...],
toolExecutor: myExecutor,
onBeforeExecution: async (toolName, args) => {
if (toolName === 'dangerous_tool') {
agent.stop(); // Stop the agent entirely
return false; // Deny this specific execution
}
return true;
}
});
// Stop externally (e.g., from a timeout or user action)
setTimeout(() => agent.stop(), 60_000);
const result = await agent.chat('Do some work');
For CodeAgent, stop() also kills any currently running child process via SIGTERM.
Shared Features
All classes extend BaseGemini and share these features:
Authentication
// Gemini API (default)
new Chat({ apiKey: 'your-key' }); // or GEMINI_API_KEY env var
// Vertex AI
new Chat({ vertexai: true, project: 'my-gcp-project' });
Token Estimation
const { inputTokens } = await instance.estimate({ some: 'payload' });
const cost = await instance.estimateCost({ some: 'payload' });
Usage Tracking
const usage = instance.getLastUsage();
// { promptTokens, responseTokens, totalTokens, attempts, modelVersion, requestedModel, timestamp }
Few-Shot Seeding
await instance.seed([
{ PROMPT: { x: 1 }, ANSWER: { y: 2 } }
]);
Thinking Configuration
new Chat({
modelName: 'gemini-2.5-flash',
thinkingConfig: { thinkingBudget: 1024 }
});
Google Search Grounding
Ground responses in real-time web search results. Available on all classes.
const chat = new Chat({
enableGrounding: true,
groundingConfig: { excludeDomains: ['example.com'] }
});
const result = await chat.send('Who won the 2026 Super Bowl?');
const sources = result.usage?.groundingMetadata?.groundingChunks;
Warning: Google Search grounding costs ~$35/1k queries.
Rate Limit Handling (429)
All classes automatically retry on 429 RESOURCE_EXHAUSTED errors with exponential backoff. This is separate from Transformer's validation retry logic (maxRetries).
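The backoff schedule described above can be sketched as follows. This is illustrative only; the exact jitter formula is an assumption, not the library's internal code.

```javascript
// Delay doubles each attempt, starting from the initial delay, plus random
// jitter. The 0-250ms jitter range here is an illustrative assumption.
function backoffDelay(attempt, initialDelay = 1000, jitterMs = 250) {
  const base = initialDelay * 2 ** attempt; // 1000, 2000, 4000, ...
  return base + Math.floor(Math.random() * jitterMs);
}

for (let attempt = 0; attempt < 3; attempt++) {
  console.log(`attempt ${attempt + 1}: wait ~${backoffDelay(attempt)}ms`);
}
```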
// Defaults: 5 retries, 1000ms initial delay (doubles each attempt + jitter)
const chat = new Chat({ systemPrompt: 'Hello' });
// Customize
const transformer = new Transformer({
resourceExhaustedRetries: 10, // more retries for high-throughput pipelines
resourceExhaustedDelay: 2000 // start with 2s backoff
});
// Disable entirely
const msg = new Message({ resourceExhaustedRetries: 0 });
When a 429 is encountered, retries are logged at WARN level:
WARN: Rate limited (429). Retrying in 1234ms (attempt 1/5)...
Context Caching
Reduce costs by caching repeated system prompts, documents, or tool definitions.
const chat = new Chat({ systemPrompt: longSystemPrompt });
// Create a cache
const cache = await chat.createCache({
ttl: '3600s',
displayName: 'my-system-prompt-cache'
});
// Use the cache (subsequent calls use cached tokens at reduced cost)
await chat.useCache(cache.name);
const result = await chat.send('Hello!');
// Clean up
await chat.deleteCache(cache.name);
Billing Labels (Vertex AI)
new Transformer({
vertexai: true,
project: 'my-project',
labels: { app: 'pipeline', env: 'prod' }
});
Constructor Options
All classes accept BaseGeminiOptions:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| modelName | string | 'gemini-2.5-flash' | Gemini model to use |
| systemPrompt | string | varies by class | System prompt |
| apiKey | string | env var | Gemini API key |
| vertexai | boolean | false | Use Vertex AI |
| project | string | env var | GCP project ID |
| location | string | 'global' | GCP region |
| chatConfig | object | — | Gemini chat config overrides |
| thinkingConfig | object | — | Thinking features config |
| maxOutputTokens | number | 50000 | Max tokens in response (null removes limit) |
| logLevel | string | based on NODE_ENV | 'trace', 'debug', 'info', 'warn', 'error', or 'none' |
| labels | object | — | Billing labels (Vertex AI) |
| enableGrounding | boolean | false | Enable Google Search grounding |
| groundingConfig | object | — | Grounding config (excludeDomains, timeRangeFilter) |
| cachedContent | string | — | Cached content resource name |
| resourceExhaustedRetries | number | 5 | Max retry attempts for 429 rate-limit errors |
| resourceExhaustedDelay | number | 1000 | Initial backoff delay (ms) for 429 retries |
Transformer-Specific
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| sourceKey/promptKey | string | 'PROMPT' | Key for input in examples |
| targetKey/answerKey | string | 'ANSWER' | Key for output in examples |
| contextKey | string | 'CONTEXT' | Key for context in examples |
| maxRetries | number | 3 | Retry attempts for validation |
| retryDelay | number | 1000 | Initial retry delay (ms) |
| responseSchema | object | — | JSON schema for output validation |
| asyncValidator | function | — | Global async validator |
ToolAgent-Specific
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| tools | array | — | Tool declarations (FunctionDeclaration format) |
| toolExecutor | function | — | async (toolName, args) => result |
| maxToolRounds | number | 10 | Max tool-use loop iterations |
| onToolCall | function | — | Notification callback when tool is called |
| onBeforeExecution | function | — | async (toolName, args) => boolean — gate execution |
| parallelToolCalls | boolean \| number | true | Parallel tool execution: false = sequential, true = unlimited, number = concurrency limit |
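When parallelToolCalls is a number, tool calls in a round run concurrently but with at most that many in flight at once. A standalone sketch of concurrency-limited execution (illustrative, not the library's internals):

```javascript
// Run async tasks concurrently with at most `limit` in flight at a time.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const i = next++; // claim the next task index
      results[i] = await tasks[i]();
    }
  }
  // Spawn `limit` workers that drain the shared task queue.
  await Promise.all(Array.from({ length: Math.min(limit, tasks.length) }, worker));
  return results;
}

const demoTasks = [1, 2, 3, 4].map((n) => async () => n * 10);
runWithLimit(demoTasks, 2).then((r) => console.log(r)); // [ 10, 20, 30, 40 ]
```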
CodeAgent-Specific
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| workingDirectory | string | process.cwd() | Directory for code execution |
| maxRounds | number | 10 | Max code execution loop iterations |
| timeout | number | 30000 | Per-execution timeout (ms) |
| onBeforeExecution | function | — | async (code) => boolean — gate execution |
| onCodeExecution | function | — | Notification after execution |
Message-Specific
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| responseSchema | object | — | Schema for structured output |
| responseMimeType | string | — | e.g. 'application/json' |
Embedding-Specific
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| taskType | string | — | 'RETRIEVAL_DOCUMENT', 'RETRIEVAL_QUERY', 'SEMANTIC_SIMILARITY', 'CLUSTERING' |
| title | string | — | Document title (only with RETRIEVAL_DOCUMENT) |
| outputDimensionality | number | — | Output vector dimensions |
| autoTruncate | boolean | true | Auto-truncate long inputs |
Exports
// Named exports
import { Transformer, Chat, Message, ToolAgent, CodeAgent, RagAgent, Embedding, BaseGemini, log } from 'ak-gemini';
import { extractJSON, attemptJSONRecovery } from 'ak-gemini';
// Default export (namespace)
import AI from 'ak-gemini';
new AI.Transformer({ ... });
new AI.Embedding({ ... });
// CommonJS
const { Transformer, Chat, Embedding } = require('ak-gemini');
Testing
npm test
All tests use real Gemini API calls (no mocks). Rate limiting (429 errors) can cause intermittent failures.
Migration from v1.x
See MIGRATION.md for a detailed guide on upgrading from v1.x to v2.0.
