describe-cli

v1.0.1

Published

10 months ago

Describe GIFs, images, and videos using Gemini

tldraw-personal

tellmewhat

A command-line tool that uses Google's Gemini AI to analyze and describe images, GIFs, and videos. It is suited to generating detailed descriptions of visual content, particularly tldraw-related media.

Features

🖼️ Image Analysis: Supports PNG, JPG, and JPEG formats
🎬 Video Analysis: Supports MP4 format
🎞️ GIF Analysis: Automatically converts GIFs to MP4 for analysis
🤖 AI-Powered: Uses Google's Gemini 2.0 Flash model for detailed descriptions
⚡ Fast: Efficient processing with automatic cleanup
🎨 tldraw Optimized: Specialized prompts for tldraw whiteboard content

Installation

Prerequisites

Node.js 18 or higher
A Google API key with Gemini API access

Install from npm (coming soon)

npm install -g describe-cli

Build from source

git clone <repository-url>
cd describe-gif
npm install
npm run build
npm link

Usage

Basic Usage

describe <file-path> [options]

Examples

# Analyze an image with API key from environment
export GOOGLE_API_KEY="your-api-key-here"
describe ./screenshot.png

# Analyze a GIF with inline API key
describe ./animation.gif --api-key=your-api-key-here

# Use a different model
describe ./video.mp4 --model=gemini-1.5-pro --api-key=your-api-key-here

Options

--api-key=<key>: Google API key (alternatively set GOOGLE_API_KEY environment variable)
--model=<model>: Gemini model to use (default: gemini-2.0-flash)
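
For illustration only, the `--flag=value` syntax above could be parsed along the following lines. This is a minimal sketch, not the tool's actual source: `CliOptions`, `DEFAULT_MODEL`, and `parseCliArgs` are hypothetical names, and treating the first non-flag argument as the file path is an assumption.

```typescript
// Hypothetical sketch: mirrors the documented options, not the real implementation.
interface CliOptions {
  filePath?: string;
  apiKey?: string;
  model: string;
}

// Default taken from the Options list above.
const DEFAULT_MODEL = "gemini-2.0-flash";

function parseCliArgs(argv: string[]): CliOptions {
  const options: CliOptions = { model: DEFAULT_MODEL };
  for (const arg of argv) {
    if (arg.startsWith("--api-key=")) {
      options.apiKey = arg.slice("--api-key=".length);
    } else if (arg.startsWith("--model=")) {
      options.model = arg.slice("--model=".length);
    } else if (!arg.startsWith("--") && options.filePath === undefined) {
      // Assumption: the first non-flag argument is the file path.
      options.filePath = arg;
    }
  }
  return options;
}
```

In a Node.js entry point this would typically be called as `parseCliArgs(process.argv.slice(2))`.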

Supported File Types

Images: .png, .jpg, .jpeg
Videos: .mp4
GIFs: .gif (automatically converted to MP4)
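
One way to picture the extension check is a small lookup table. The sketch below is an assumption about how such a check might look, not the tool's code: `MediaKind`, `EXTENSION_KINDS`, and `classifyMedia` are hypothetical names, and the case-insensitive match is a guess.

```typescript
// Hypothetical sketch of mapping a file extension to a media kind.
type MediaKind = "image" | "video" | "gif";

// Extensions taken from the Supported File Types list above.
const EXTENSION_KINDS: Record<string, MediaKind> = {
  ".png": "image",
  ".jpg": "image",
  ".jpeg": "image",
  ".mp4": "video",
  ".gif": "gif",
};

function classifyMedia(filePath: string): MediaKind {
  // Look only at the final path segment so "./dir.v2/file" is not misread.
  const name = filePath.split(/[\\/]/).pop() ?? "";
  const dot = name.lastIndexOf(".");
  // Assumption: matching is case-insensitive (".PNG" is treated like ".png").
  const ext = dot === -1 ? "" : name.slice(dot).toLowerCase();
  const kind = EXTENSION_KINDS[ext];
  if (kind === undefined) {
    throw new Error(`Unsupported file type: ${ext || "(no extension)"}`);
  }
  return kind;
}
```

A caller could branch on the result, for example converting to MP4 first when the kind is `"gif"`.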

API Key Setup

Option 1: Environment Variable (Recommended)

export GOOGLE_API_KEY="your-api-key-here"

Add this line to your shell profile (for example .bashrc or .zshrc) so the key persists across sessions.

Option 2: Command Line Flag

describe image.png --api-key=your-api-key-here
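
The two options can be combined into a single lookup. The sketch below assumes the `--api-key` flag takes precedence over `GOOGLE_API_KEY`; the README does not state the precedence, so treat that ordering as a guess. `resolveApiKey` is a hypothetical helper name.

```typescript
// Hypothetical sketch: pick the API key from the flag or the environment.
function resolveApiKey(
  flagValue: string | undefined,
  env: Record<string, string | undefined>,
): string {
  // Assumption: an explicit --api-key flag wins over GOOGLE_API_KEY.
  const key = flagValue?.trim() || env.GOOGLE_API_KEY?.trim();
  if (!key) {
    // Message text borrowed from the Troubleshooting section.
    throw new Error("No API key provided");
  }
  return key;
}
```

In Node.js the second argument would usually be `process.env`. Passing the environment in explicitly keeps the function easy to test.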

Output

The tool provides detailed descriptions optimized for understanding visual content, with special attention to:

Main subjects and objects
Actions and movements (for videos/GIFs)
UI elements and interactions (for tldraw content)
Spatial relationships and layout
Changes and modifications over time

Example Output

✅ Description:
---
The image shows a tldraw whiteboard interface with several geometric shapes arranged on a white canvas. In the center, there's a blue rectangle connected to a red circle via a black arrow. The toolbar at the bottom displays various drawing tools including select, hand, draw, eraser, and shape tools. The style panel on the right shows color options and stroke settings, indicating the blue rectangle is currently selected with medium stroke width and solid fill.

Development

Setup

npm install

Build

npm run build

Test

npm test

Project Structure

src/
├── index.ts              # Main CLI entry point
├── gemini-analyzer.ts    # Gemini API integration
├── gif-converter.ts      # GIF to MP4 conversion
└── __tests__/           # Test files

Testing

The project includes comprehensive unit tests with:

94.44% statement coverage
84.61% branch coverage
100% function coverage

Tests mock the Gemini API, so they run quickly and reliably without making real API calls.

npm test

Error Handling

The tool provides clear error messages for common issues:

Missing file path
File not found
Missing API key
Unsupported file types
API errors

Dependencies

@google/genai: Google Gemini AI SDK
ffmpeg-static: For GIF to MP4 conversion
dotenv: Environment variable loading
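
As a rough idea of what the GIF to MP4 step might involve, the sketch below builds an FFmpeg argument list using a commonly used recipe for GIF input. The exact flags this tool passes are not documented here, so these are assumptions, and `buildGifToMp4Args` is a hypothetical helper name.

```typescript
// Hypothetical sketch: FFmpeg arguments for converting a GIF to an MP4.
function buildGifToMp4Args(inputGif: string, outputMp4: string): string[] {
  return [
    "-y", // overwrite the output file if it already exists
    "-i", inputGif, // input file
    "-movflags", "+faststart", // move metadata to the front of the file
    "-pix_fmt", "yuv420p", // widely compatible pixel format
    // Round width and height down to even numbers, as yuv420p requires.
    "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
    outputMp4,
  ];
}
```

The resulting array could be handed to Node's `child_process.spawn` together with the binary path that `ffmpeg-static` exports.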

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Ensure all tests pass
Submit a pull request

License

MIT License - see LICENSE file for details

Troubleshooting

Common Issues

"No API key provided"

Set the GOOGLE_API_KEY environment variable or pass the --api-key flag.

"Unsupported file type"

Check that your file has a supported extension (.png, .jpg, .jpeg, .mp4, .gif).

"File not found"

Verify that the file path is correct and the file exists.

FFmpeg errors (GIF conversion)

The tool ships with a bundled FFmpeg binary (via ffmpeg-static). If GIF conversion fails, check that a binary is available for your operating system and CPU architecture.

Getting Help

If you encounter issues:

Check that your API key is valid and has Gemini API access
Verify the file exists and is in a supported format
Check the error message for specific guidance
Review the troubleshooting section above

For bugs or feature requests, please open an issue on the repository.