tiny-gemini v1.2.0
tiny-gemini
Zero-dependency CLI for the Google Gemini API. Text, images, TTS, search, deep research, and raw API passthrough — all through npx.
Table of Contents
- Why This Exists
- Quick Start
- Commands Overview
- Global Options
- Agentic Workflow (--prompt-file, --output-file)
- Stack and Dependencies
- Project Structure
- Model Selection
- Detailed Documentation
- Releasing
- Reference Material
- Changelog
- License
Why This Exists
Google's Gemini API is a single unified endpoint (the Interactions API) that handles text, images, audio, search, research, and more — all through the request body. But using it requires constructing JSON payloads, managing headers, parsing multimodal responses, and converting binary formats.
tiny-gemini wraps this API with dedicated subcommands for common use cases (prompt, image, tts, search, research) plus a raw JSON passthrough for full API coverage. It is:
- Zero-dependency — only Node.js built-ins (no `node_modules`)
- Single file — everything in `cli.js` (~1070 lines)
- NPX-ready — `npx tiny-gemini "your prompt"` works immediately
- Complete — the `raw` command covers 100% of the API surface
Quick Start
Install
```sh
# Use directly via npx (no install needed)
npx tiny-gemini "What is quantum computing?"

# Or install globally
npm install -g tiny-gemini
```
Requires Node.js >= 18.0.0.
API Key Setup
Get an API key from Google AI Studio.
Interactive setup (first-time users): If no API key is configured, the CLI detects whether you're in a terminal (TTY) and offers to save the key for you:
```
$ npx tiny-gemini "hello"

No API key found. You need a free Google Gemini API key.

  1. Go to https://aistudio.google.com/app/apikey
  2. Click "Create API key" and copy it.

Paste it below to save it, or press Enter to skip: ************************************
Saved to ~/.gemini/.env
```
The key is saved to `~/.gemini/.env` and the original command continues immediately. Input is masked with `*` characters. In non-TTY environments (pipes, scripts, LLM agents), a concise error with setup commands is shown instead.
Option A: Environment variable (recommended)
```sh
export GEMINI_API_KEY="your-key-here"
```
Add to `~/.zshrc` or `~/.bashrc` to persist.
Option B: .gemini/.env file (matches official Gemini CLI convention)
```sh
# User-wide (works from any directory)
mkdir -p ~/.gemini
echo 'GEMINI_API_KEY=your-key-here' > ~/.gemini/.env

# Or project-level (takes priority)
mkdir -p .gemini
echo 'GEMINI_API_KEY=your-key-here' > .gemini/.env
```
Option C: CLI flag
```sh
npx tiny-gemini --api-key=your-key-here "Hello"
```
Key resolution order: `--api-key` > `TINY_GEMINI_API_KEY` > `GEMINI_API_KEY` > `GOOGLE_API_KEY` > `.gemini/.env` (project, searching up) > `~/.gemini/.env`.
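The precedence above is a simple first-match lookup. A minimal sketch in Node (illustrative only: `resolveApiKey` is a made-up name, not the actual cli.js function, and the file-based sources are shown here as pre-loaded values):

```javascript
// Illustrative sketch of the documented key resolution order.
// Returns the first non-empty source, or null if none is set.
function resolveApiKey({ flag, env = process.env, projectEnv, userEnv }) {
  const candidates = [
    flag,                       // --api-key
    env.TINY_GEMINI_API_KEY,
    env.GEMINI_API_KEY,
    env.GOOGLE_API_KEY,
    projectEnv,                 // .gemini/.env (project, searching up)
    userEnv,                    // ~/.gemini/.env
  ];
  return candidates.find((k) => k && k.trim() !== "") ?? null;
}

// The flag wins even when environment variables are set:
console.log(resolveApiKey({ flag: "flag-key", env: { GEMINI_API_KEY: "env-key" } }));
// prints "flag-key"
```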
Commands Overview
prompt (default)
Text generation. The command name is optional — any unrecognized first argument is treated as a prompt.
```sh
tiny-gemini "What is quantum computing?"
tiny-gemini prompt "Describe this" --file photo.png
tiny-gemini "Summarize" --file doc.pdf
tiny-gemini "Tell me a joke" --stream
tiny-gemini "Extract name and age" --schema '{"type":"object","properties":{"name":{"type":"string"},"age":{"type":"integer"}}}'
tiny-gemini "Fix bugs" --prompt-file src/app.js --output-file result.json
```
Key options: `--file`, `--prompt-file`, `--output-file`, `--output-format`, `--system`, `--schema`, `--stream`, `--model`
image
Image generation, editing, and understanding with 7 sub-commands.
```sh
tiny-gemini image "a cat on the moon"    # generate (default)
tiny-gemini image generate "a cat" --count=3 --styles=watercolor,sketch
tiny-gemini image edit photo.png "add sunglasses"
tiny-gemini image describe photo.png
tiny-gemini image story "a seed growing" --steps=4
tiny-gemini image icon "coffee cup" --style=modern
tiny-gemini image pattern "geometric" --type=seamless
tiny-gemini image diagram "login flow" --type=flowchart
```
Key options: `--count`, `--styles`, `--variations`, `--steps`, `--style`, `--type`, `--aspect-ratio`, `--image-size`
tts
Text-to-speech. Outputs a .wav file.
```sh
tiny-gemini tts "Hello, how are you today?"
tiny-gemini tts "Bonjour" --voice=kore --language=fr-fr
```
Key options: `--voice` (default: `kore`), `--language` (default: `en-us`)
search
Google Search-grounded generation.
```sh
tiny-gemini search "Who won the 2026 Super Bowl?"
tiny-gemini search "latest React release" --stream
tiny-gemini search "AI news" --output-file results.txt
```
research
Deep Research agent. Runs in the background and polls for completion.
```sh
tiny-gemini research "History of Google TPUs focusing on 2025-2026"
```
raw
JSON passthrough — sends any JSON body directly to the Interactions API. Escape hatch for function calling, MCP, code execution, computer use, or anything else the API supports.
```sh
tiny-gemini raw '{"model":"gemini-3-flash-preview","input":"hello"}'
echo '{"model":"...","input":"..."}' | tiny-gemini raw
tiny-gemini raw --file request.json
```
Global Options
| Option | Env Var | Default | Description |
|--------|---------|---------|-------------|
| --api-key | TINY_GEMINI_API_KEY > GEMINI_API_KEY > GOOGLE_API_KEY | — | API key |
| --api-base | TINY_GEMINI_API_BASE | https://generativelanguage.googleapis.com/v1beta | API base URL |
| --model | TINY_GEMINI_MODEL | per-command default | Model override |
| --output-dir | — | ./tiny-gemini-output | Output directory for generated files |
| --prompt-file | — | — | Read file contents into prompt (repeatable) |
| --output-file | — | — | Write response to file instead of stdout |
| --output-format | — | auto | plain or manifest (see Agentic Workflow) |
| --stream | — | false | Enable streaming output |
| --preview | — | false | Open generated files after saving |
| --json-output | — | false | Print raw JSON response |
| -h, --help | — | — | Show help (supports per-command: image --help) |
| -v, --version | — | — | Show version |
Agentic Workflow (--prompt-file, --output-file)
When an AI agent uses tiny-gemini via bash, large files can bloat the agent's context window in both directions. Two flags solve this by keeping the CLI as a filesystem-to-API pipe:
- `--prompt-file <path>`: The agent passes file paths; the CLI reads them and sends the contents to Gemini. The agent never sees file contents in its context. Repeatable — use multiple times for multiple files.
- `--output-file <path>`: Gemini responds, the CLI writes to disk, and the agent sees only a short summary on stdout. The agent never sees response contents in its context.
How It Works
Input side: --prompt-file reads each file as UTF-8 and wraps it with filename delimiters before appending to the prompt:
```
--- FILE: src/app.js ---
<file contents>
--- END FILE: src/app.js ---
```
`--prompt-file` and `--file` serve different purposes and work together: `--prompt-file` injects text file contents into the prompt, while `--file` sends binary files (images, audio, video, PDF) as base64 multimodal content.
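The wrapping step can be sketched as follows (an illustrative reconstruction; `wrapPromptFile` and `buildPrompt` are hypothetical names, not the actual cli.js internals):

```javascript
// Illustrative: wrap one file's contents with the delimiters shown above.
function wrapPromptFile(path, contents) {
  return `--- FILE: ${path} ---\n${contents}\n--- END FILE: ${path} ---`;
}

// Multiple --prompt-file flags append one wrapped block per file
// after the user's text prompt (which may be empty).
function buildPrompt(userPrompt, files) {
  const blocks = files.map(({ path, contents }) => wrapPromptFile(path, contents));
  return [userPrompt, ...blocks].filter(Boolean).join("\n\n");
}

console.log(buildPrompt("Fix bugs", [{ path: "src/app.js", contents: "let x = 1;" }]));
```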
Output side: When --output-file is set, the CLI uses smart detection (or --output-format override) to choose between two modes:
- Plain text mode: Writes the response text directly to the output file.
- Manifest mode: Writes a small JSON manifest to the output file, and writes large text blocks to separate files in `--output-dir`. This lets the agent read only the manifest to understand what Gemini produced.
Manifest Format
```json
{
  "outputs": [
    {
      "type": "text",
      "preview": "Found 2 bugs: null ref on line 15, missing await...",
      "file": "./tiny-gemini-output/text_1.txt",
      "bytes": 245,
      "lines": 8
    }
  ],
  "function_calls": [
    {
      "name": "write_file",
      "id": "call_123",
      "arguments": { "path": "src/app.js" }
    }
  ],
  "images": [
    { "file": "./tiny-gemini-output/prompt_1.png" }
  ],
  "audio": []
}
```
Smart Detection Rules
When --output-format is not specified, the CLI auto-detects the best format:
| Condition | Result |
|-----------|--------|
| --output-format=manifest | Always manifest |
| --output-format=plain | Always plain text |
| Response has function calls | Manifest |
| Any text block > 4000 chars | Manifest |
| Otherwise | Plain text |
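The table above translates directly into a short first-match decision function. A sketch under those stated rules (`pickOutputFormat` is a hypothetical helper, not the real implementation):

```javascript
// Illustrative sketch of the smart-detection table. `texts` is the
// list of text blocks in the response; `override` is --output-format.
function pickOutputFormat({ override, functionCalls = [], texts = [] }) {
  if (override === "manifest") return "manifest";   // explicit override wins
  if (override === "plain") return "plain";
  if (functionCalls.length > 0) return "manifest";  // function calls force manifest
  if (texts.some((t) => t.length > 4000)) return "manifest"; // large text block
  return "plain";
}

console.log(pickOutputFormat({ texts: ["short answer"] }));               // prints "plain"
console.log(pickOutputFormat({ texts: ["x".repeat(5000)] }));             // prints "manifest"
console.log(pickOutputFormat({ override: "plain", functionCalls: [{}] })); // prints "plain"
```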
Agentic Examples
```sh
# Simple — short response goes to a plain text file
tiny-gemini "What is 2+2?" --output-file answer.txt
# Stdout: "Response written to answer.txt"

# Code review — reads files, auto-detects manifest for large output
tiny-gemini "Fix bugs in these files" \
  --prompt-file src/app.js \
  --prompt-file src/utils.js \
  --output-file /tmp/manifest.json
# Stdout: "Manifest written to /tmp/manifest.json (3 text blocks, 847 lines)"

# Force manifest even for short responses
tiny-gemini "What is 2+2?" --output-file answer.json --output-format=manifest

# Force plain text even for long responses
tiny-gemini "Rewrite this" --prompt-file app.js --output-file rewritten.txt --output-format=plain

# Prompt-file only (no text prompt), with a system instruction
tiny-gemini --prompt-file src/app.js --system "Explain this code"

# Combined with multimodal + schema
tiny-gemini "Fix bugs shown in this screenshot" \
  --file screenshot.png \
  --prompt-file src/app.js \
  --output-file /tmp/manifest.json
```
Stack and Dependencies
| Component | Details |
|-----------|---------|
| Runtime | Node.js >= 18.0.0 |
| Dependencies | None — zero node_modules |
| API | Gemini Interactions API (v1beta) |
| HTTP | Built-in fetch (Node 18+) |
| Arg parsing | node:util parseArgs |
| File I/O | node:fs/promises |
| Audio encoding | Inline WAV header construction (no ffmpeg) |
| Config | .gemini/.env file loader (built-in, no dotenv package) |
| Module format | ESM ("type": "module") |
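The "inline WAV header construction" row refers to prepending a standard 44-byte RIFF header to the raw PCM samples returned by the API. A hedged sketch of that technique (the sample rate, channel count, and bit depth below are assumptions for illustration, not values confirmed by cli.js):

```javascript
// Build a standard 44-byte RIFF/WAVE header for raw PCM audio.
// Assumed parameters (not taken from cli.js): mono, 24 kHz, 16-bit.
function wavHeader(dataLength, sampleRate = 24000, channels = 1, bitsPerSample = 16) {
  const byteRate = sampleRate * channels * (bitsPerSample / 8);
  const blockAlign = channels * (bitsPerSample / 8);
  const h = Buffer.alloc(44);
  h.write("RIFF", 0);
  h.writeUInt32LE(36 + dataLength, 4); // total file size minus 8 bytes
  h.write("WAVE", 8);
  h.write("fmt ", 12);
  h.writeUInt32LE(16, 16);             // fmt chunk size
  h.writeUInt16LE(1, 20);              // audio format 1 = uncompressed PCM
  h.writeUInt16LE(channels, 22);
  h.writeUInt32LE(sampleRate, 24);
  h.writeUInt32LE(byteRate, 28);
  h.writeUInt16LE(blockAlign, 32);
  h.writeUInt16LE(bitsPerSample, 34);
  h.write("data", 36);
  h.writeUInt32LE(dataLength, 40);     // size of the PCM payload
  return h;
}

const header = wavHeader(1000);
console.log(header.subarray(0, 4).toString()); // prints "RIFF"
```

Writing the header followed by the PCM bytes yields a playable .wav file with no external tools, which is why no ffmpeg dependency is needed.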
Project Structure
```
tiny-gemini/
├── cli.js                    # Single executable (~1070 lines, all logic)
├── package.json              # NPX-ready with bin entry
├── CHANGELOG.md              # Release history (Keep a Changelog format)
├── LICENSE                   # BSD-3-Clause
├── .gitignore
├── README.md
└── docs/
    ├── api-reference.md      # Gemini Interactions API details
    ├── architecture.md       # Code structure and how to add features
    ├── commands.md           # Full command reference with request bodies
    ├── model-selection.md    # Model comparison, pricing, and decision rules
    ├── prompt-engineering.md # Image presets, batch generation, variations
    └── 20260307-gemini/      # Local snapshots of official Google docs
        ├── interactions.md
        └── image-generation.md
```
Model Selection
See Model Selection Guide for a complete comparison of all Gemini models — capabilities, pricing, and decision rules for choosing the right model for each task.
Detailed Documentation
The following docs provide the technical depth needed to understand, extend, or debug the CLI. Start with the one that matches your task:
| Document | When to Read |
|----------|--------------|
| API Reference | Understanding the Gemini Interactions API: endpoint, headers, request/response format, streaming SSE protocol, output types, models, and known limitations |
| Architecture | Understanding the code structure, adding new commands or sub-commands, modifying config resolution, the .env loader, or the API client |
| Model Selection | Choosing which Gemini model to use: decision rules, capabilities, pricing, and comparison tables for text, image, and specialized models |
| Commands | Full reference for every command and option, including the exact request bodies sent to the API and how responses are processed |
| Prompt Engineering | Image generation presets (icon, pattern, diagram, story), batch generation with styles/variations, and how prompt builders work |
Releasing
New versions are released using the /release Claude Code skill. It automates the full workflow:
- Reasons about session changes to determine the semver bump type (or accepts `patch`, `minor`, or `major` as an argument)
- Runs preflight checks — clean working tree, correct branch
- Bumps the version in `package.json` and syncs `cli.js`
- Updates `CHANGELOG.md` with categorized changes in Keep a Changelog format
- Verifies the CLI loads and shows the correct version
- Commits, tags, and pushes to GitHub
- Shows the `npm publish` command for the user to run manually

```sh
/release patch   # bug fixes
/release minor   # new features
/release major   # breaking changes
/release         # agent decides and asks you to confirm
```
The skill is at `.claude/skills/release/` and is user-invoked only — it never triggers automatically.
Reference Material
This project was built using the following official Google documentation. Local snapshots of the two primary sources are saved in docs/20260307-gemini/ for offline reference and to preserve the exact API state this project was built against.
| Source | Local Snapshot | Description |
|--------|----------------|-------------|
| Gemini Interactions API | docs/20260307-gemini/interactions.md | The unified API endpoint this CLI wraps |
| Gemini Image Generation | docs/20260307-gemini/image-generation.md | Image generation models, capabilities, and configuration |
| Gemini API Models | — | Available models and their capabilities |
| Gemini API Keys | — | API key setup and environment variable conventions |
| Gemini CLI Authentication | — | .gemini/.env file convention |
When the API changes or models are deprecated, compare the local snapshots against the live docs to understand what shifted.
Changelog
See CHANGELOG.md for release history. Format follows Keep a Changelog.
License
BSD-3-Clause
