@genspark/cli

v1.0.19

Published

2 days ago

CLI tool for Genspark Tool API - search, crawl, analyze images, generate media

Genspark CLI (`gsk`)

One CLI. Every AI capability. Search, generate, analyze, communicate — all from your terminal.

gsk is the command-line interface for the Genspark AI platform. It unifies 90+ AI tools behind a single binary: web search, image/video/audio generation with 40+ models, document analysis, media transcription, cloud file management, email (Gmail & Outlook), calendar, GitHub, Slack, Notion, Microsoft Teams, OneDrive, SharePoint, AI phone calls, stock data, social media data (Twitter, Instagram, Reddit), and autonomous AI agents — all with clean JSON output for seamless integration with AI coding assistants, automation pipelines, and scripts.

Capability Map

| Category | What You Get |
|----------|-------------|
| 🔍 Search | Web search, image search |
| 📄 Documents | Crawl pages, summarize PDFs/docs |
| 🎨 Images | 16 models: GPT Image, Gemini, Flux 2, Imagen 4, Recraft, Ideogram, Seedream ... |
| 🎬 Videos | 14 models: Kling V3, Veo 3.1, Sora 2, Hailuo, Wan, Runway, PixVerse, Seedance ... |
| 🎵 Audio | 14 models: Gemini TTS, ElevenLabs, MiniMax, Mureka, CassetteAI, Lyria 2 ... |
| 🧠 Analysis | Image/video/audio understanding, OCR, video style replication |
| 📝 Transcribe | Whisper, Gemini, ElevenLabs Scribe |
| ☁️ AI Drive | Cloud file storage, download, compress |
| 📧 Email | Gmail & Outlook: read, search, send, reply, forward, archive, labels, attachments |
| 📅 Calendar | Google & Outlook: list, create, delete events |
| 💬 Collaboration | Slack, Microsoft Teams, Notion: send messages, search, manage channels/pages |
| 📂 Cloud Storage | Google Drive, OneDrive, SharePoint, Google Sheets, Google Docs, Google Contacts |
| 🐙 GitHub | List repos, search/create/update issues |
| 📞 Phone | AI-powered phone calls to businesses |
| 📈 Stocks | Real-time stock prices |
| 📱 Social Media | Twitter/X, Instagram, Reddit: search posts/users, get comments, connections, and more (30 APIs) |
| 🤖 Agents | Podcasts, docs, slides, deep research, fact-checking, websites, batch media generation |
| 🔊 Voice | Voice cloning, voice changer |

Installation

npm install -g @genspark/cli

Requires Node.js >= 18.

Quick Start

# Log in via browser
gsk login

# Search the web
gsk search "latest AI news"

# Generate an image
gsk img "A beautiful sunset over mountains" -o ./sunset.png

# Crawl a webpage
gsk crawl "https://example.com/article"

Authentication

gsk login

This opens a browser for authentication and saves the API key to ~/.genspark-tool-cli/config.json.

Alternatively, provide an API key directly:

# Via environment variable
export GSK_API_KEY="gsk_..."

# Via CLI option
gsk search "query" --api-key "gsk_..."

To check your current identity:

gsk login-info
gsk me          # shorthand

To log out:

gsk logout

Configuration

Config is loaded from three sources (highest priority first):

1. CLI options: --api-key, --base-url, etc.
2. Environment variables: GSK_API_KEY, GSK_BASE_URL, GSK_PROJECT_ID
3. Config file: ~/.genspark-tool-cli/config.json

{
  "api_key": "gsk_...",
  "base_url": "https://www.genspark.ai",
  "project_id": "project_abc123",
  "debug": false,
  "timeout": 300000
}
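
The precedence order described above can be illustrated with a small shell sketch. `resolve_setting` is a hypothetical helper written for this example, not part of `gsk`; it only models "first non-empty source wins" and makes no claim about how the CLI implements the lookup internally.

```shell
# Illustrative only: models the documented precedence
# (CLI option > environment variable > config file).
# resolve_setting is a hypothetical helper, not a gsk command.

# Usage: resolve_setting <cli_value> <env_value> <file_value>
# Prints the first non-empty value; returns 1 if every source is empty.
resolve_setting() {
  local candidate
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# No CLI flag given, environment variable set, config file has another value:
resolve_setting "" "https://env.example.com" "https://www.genspark.ai"
# prints: https://env.example.com
```

The URL `https://env.example.com` is a placeholder chosen for the example.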

Global Options

| Option | Env Var | Default | Description |
|--------|---------|---------|-------------|
| --api-key <key> | GSK_API_KEY | — | API key (required) |
| --base-url <url> | GSK_BASE_URL | https://www.genspark.ai | API base URL |
| --project-id <id> | GSK_PROJECT_ID | — | Project ID for access control |
| --debug | — | false | Enable debug output |
| --timeout <ms> | — | 300000 (5 min) | Request timeout |
| --output <format> | — | json | Output format: json or text |
| --refresh | — | — | Force refresh cached tool schemas |
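
Because --timeout is given in milliseconds, a small conversion helper can make scripts easier to read. This is a minimal sketch; `minutes_to_ms` is a hypothetical function defined here, not something the CLI provides.

```shell
# minutes_to_ms is a hypothetical helper, not a gsk command.
# It converts whole minutes to the milliseconds expected by --timeout.
minutes_to_ms() {
  echo $(( $1 * 60 * 1000 ))
}

minutes_to_ms 5    # prints: 300000 (matches the documented default)
minutes_to_ms 10   # prints: 600000
```

The result could then be passed along as `--timeout "$(minutes_to_ms 10)"` on any command.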

Commands

list-tools (alias: `ls`)

List all available tools.

gsk list-tools
gsk ls

login-info (alias: `me`)

Show your current account info — email, name, and membership plan.

gsk login-info
gsk me

init-opencode

Generate an .opencode.json config file for OpenCode, pre-configured to use Genspark's LLM proxy with your API key.

# Generate with default model (claude-opus-4-6-1m)
gsk init-opencode

# Specify a different default model
gsk init-opencode --model claude-sonnet-4-6

# Write to a custom path
gsk init-opencode -o ./my-project/.opencode.json

| Option | Default | Description |
|--------|---------|-------------|
| --model <name> | claude-opus-4-6-1m | Default model for OpenCode |
| -o, --out <path> | .opencode.json (cwd) | Output file path |

init-skills

Sync GSK skill documents into the current project for AI agent discovery. Copies all skill docs and generates a CONTEXT.md entry point that AI agents (Claude Code, Gemini, etc.) can load automatically.

# Copy skills to .gsk/skills/ and generate CONTEXT.md
gsk init-skills

# Also generate .claude/ config for Claude Code
gsk init-skills --agent claude

# Generate config for all supported agents (Claude, Gemini)
gsk init-skills --agent all

# Custom output directory
gsk init-skills -o ./docs/gsk-skills

| Option | Default | Description |
|--------|---------|-------------|
| -o, --out <dir> | .gsk/skills (cwd) | Output directory for skills |
| --agent <type> | — | Generate agent config: claude, gemini, or all |

Search & Crawl

web_search (alias: `search`)

Search the web.

gsk search "latest AI news"

crawler (alias: `crawl`)

Extract content from a web page or document.

gsk crawl "https://example.com/article"

summarize_large_document (alias: `summarize`)

Analyze a document and answer questions about it.

gsk summarize "https://example.com/report.pdf" --question "What are the key findings?"

| Option | Description |
|--------|-------------|
| <url> | Document URL (required, positional) |
| --question <text> | Question about the document |

image_search (alias: `img-search`)

Search for images on the web.

gsk img-search "modern architecture"

Media Analysis & Transcription

understand_images (alias: `analyze`)

Analyze images with an AI vision model.

gsk analyze "Describe this image" -i "https://example.com/image.jpg"
gsk analyze "Extract all text" -i "https://img1.jpg" "https://img2.jpg"
gsk analyze "What's in this photo?" -i ./photo.jpg

| Option | Default | Description |
|--------|---------|-------------|
| -i, --image_urls <url...> | — | Image URL(s) or local file path(s) to analyze (required) |
| -r, --instruction <text> | — | Custom analysis instruction |

Image Generation

image_generation (alias: `img`)

Generate images using AI. Supports text-to-image and image-to-image.

# Text-to-image
gsk img "A beautiful sunset over mountains" -r "16:9" -o ./sunset.png
gsk img "Modern office at night" -s "4k" -r "1:1"

# Image-to-image (reference-based)
gsk img "A portrait in similar style" -i ./reference.png

| Option | Default | Description |
|--------|---------|-------------|
| -r, --aspect_ratio <ratio> | 1:1 | Aspect ratio (1:1, 16:9, 9:16) |
| -s, --image_size <size> | auto | Image size: auto, 2k, 4k |
| -m, --model <name> | — | Model to use (optional) |
| -i, --image_urls <url...> | — | Reference image URL(s) or local file(s) for image-to-image |
| -o, --output-file <path> | — | Download the generated file to a local path |

Video Generation

video_generation (alias: `video`)

Generate videos using AI.

gsk video "A cat playing with yarn" -m "kling/v1.6/standard" -d 5 -o ./cat.mp4
gsk video "Sunrise over a beach" -m "minimax/hailuo-02/standard" -r "16:9" -d 8

# Image-to-video
gsk video "Camera pan around the subject" -m "kling/v1.6/standard" -i ./photo.jpg

| Option | Default | Description |
|--------|---------|-------------|
| -m, --model <name> | — | Model (required). e.g., kling/v1.6/standard, minimax/hailuo-02/standard |
| -r, --aspect_ratio <ratio> | 16:9 | Aspect ratio |
| -d, --duration <sec> | 5 | Duration in seconds (2-15) |
| -i, --image_urls <url...> | — | Reference image URL(s) or local file(s) |
| -a, --audio_url <url> | — | Audio URL for soundtrack |
| -o, --output-file <path> | — | Download the generated file to a local path |

Audio Generation

audio_generation (alias: `audio`)

Generate audio: TTS, music, or sound effects.

# Text-to-speech
gsk audio "Hello, welcome to Genspark!" -m "google/gemini-2.5-pro-preview-tts" -r "professional female voice"
gsk audio "Hello, welcome to Genspark!" -m "google/gemini-2.5-pro-preview-tts" -o ./hello.mp3

# Music with lyrics
gsk audio "A pop song" -m "fal-ai/minimax-music/v2.6" -l "Verse 1: ..." -d 120

# Sound effect
gsk audio "Door creaking slowly open" -m "elevenlabs/sound-effects"

| Option | Default | Description |
|--------|---------|-------------|
| -m, --model <name> | — | Model (required). e.g., elevenlabs/v3-tts, fal-ai/minimax/speech-2.6-hd |
| -d, --duration <sec> | 0 (auto) | Duration in seconds |
| -r, --requirements <text> | — | Voice requirements for TTS |
| -l, --lyrics <text> | — | Lyrics for song generation |
| -o, --output-file <path> | — | Download the generated file to a local path |

File Transfer

upload

Upload a local file and get a URL for use in other commands.

gsk upload "./image.png"
gsk upload "./document.pdf"

download

Download a file from a file wrapper URL.

# Get download URL only
gsk download "/api/files/s/abc123"

# Download and save to local file
gsk download "/api/files/s/abc123" -s "./downloaded.png"

| Option | Description |
|--------|-------------|
| -s, --save <path> | Download and save to local file path |

analyze_media (alias: `media-analyze`)

Analyze various types of media content including images, audio, and video.

gsk media-analyze -i "https://example.com/image.jpg" -r "Describe the content"
gsk media-analyze -i "https://example.com/video.mp4" -r "Summarize the video"

| Option | Default | Description |
|--------|---------|-------------|
| -i, --media_urls <urls> | — | Media URL(s) to analyze (required) |
| -r, --requirements <text> | — | Analysis instructions |

audio_transcribe (alias: `transcribe`)

Transcribe audio files to text.

gsk transcribe -i "https://example.com/audio.mp3"
gsk transcribe -i ./meeting.wav -m "whisper-large-v3"

| Option | Default | Description |
|--------|---------|-------------|
| -i, --audio_urls <url...> | — | Audio URL(s) or local file(s) to transcribe (required) |
| -m, --model <name> | — | Transcription model to use |

AI Drive (Cloud Storage)

aidrive (alias: `drive`)

AI-Drive file storage and management. List, create, delete, move files and directories. Download videos, audio, and files from URLs directly to AI-Drive.

# List files in root directory
gsk drive ls
gsk drive ls -p "/documents" -f file

# Create directory
gsk drive mkdir -p "/my-folder"

# Move file
gsk drive move -p "/old-path/file.txt" --target_path "/new-path/file.txt"

# Download video/audio/file to AI-Drive
gsk drive download_video --video_url "https://example.com/video.mp4" --target_folder "/videos"
gsk drive download_file --file_url "https://example.com/doc.pdf" --target_folder "/docs"

# Upload inline text content to AI-Drive
gsk drive upload --file_content "Hello World" --upload_path "/notes/hello.txt"

# Upload a local file directly to AI-Drive (streaming, supports 100MB+ files)
gsk drive upload --local_file ./report.pdf --upload_path /docs/report.pdf
gsk drive upload --local_file ./video.mp4 --upload_path /videos/demo.mp4
gsk drive upload --local_file ./photo.png              # upload_path defaults to /photo.png
gsk drive upload --local_file ./doc.pdf --upload_path /docs/doc.pdf --override  # overwrite existing

# Get readable URL for a file
gsk drive get_readable_url -p "/documents/report.pdf"

| Option | Default | Description |
|--------|---------|-------------|
| -p, --path <path> | — | File or directory path in AI-Drive |
| -f, --filter_type <type> | all | Filter: all, file, directory |
| --file_type <type> | all | File type filter: all, audio, video, image |
| --target_path <path> | — | Target path for move operations |
| --target_folder <path> | — | Target folder for downloads |
| --video_url <url> | — | Video URL for download_video action |
| --audio_url <url> | — | Audio URL for download_audio action |
| --file_url <url> | — | File URL for download_file action |
| --file_name <name> | — | Custom file name for downloads |
| --file_content <text> | — | Inline text content to upload |
| --local_file <path> | — | Local file path to upload directly to AI-Drive (streaming, no size limit) |
| --upload_path <path> | — | Destination path for upload (defaults to /<filename> for --local_file) |
| --override | false | Overwrite an existing file at the destination path |

AI Agents & Tasks

create_task (alias: `task`)

Create and execute tasks using specialized AI agents.

# Create a podcast
gsk task podcasts --task_name "AI Trends" --query "Create a podcast about AI trends" --instructions "Focus on practical applications"

# Create a document
gsk task docs --task_name "Quantum Report" --query "Write a report on quantum computing" --instructions "Include recent breakthroughs"

# Create slides
gsk task slides --task_name "Q4 Results" --query "Create a Q4 results presentation" --instructions "Use charts and data"

# Create a spreadsheet (returns file wrapper URL, use `gsk download` to save)
gsk task sheets --task_name "Sales Report" --query "Create a quarterly sales report with formulas" --instructions "Use formulas and formatting"

# Deep research
gsk task deep_research --task_name "Fusion Energy" --query "Research fusion energy advances" --instructions "Cover public and private sector"

# Fact-check a claim
gsk task cross_check --task_name "Earth shape" --query "The Earth is flat" --instructions "Verify this claim with evidence"

| Option | Default | Description |
|--------|---------|-------------|
| --task_name <name> | — | Name for the task (required) |
| --query <text> | — | Query describing what to create (required) |
| --instructions <text> | — | Detailed instructions (required) |
| --acp | false | Start as ACP (Agent Client Protocol) stdio agent for multi-turn use with Genspark Claw |

Supported task types: super_agent, podcasts, docs, slides, sheets, deep_research, website, video_generation, audio_generation, meeting_notes, cross_check
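
A script that accepts a task type from user input may want to check it against the list above before calling `gsk task`. The following is a minimal sketch; `is_supported_task_type` is a hypothetical helper, and the hard-coded names are simply copied from the list in this section, so they would need updating if the list changes.

```shell
# is_supported_task_type is a hypothetical helper, not a gsk command.
# The names below are copied from the "Supported task types" list.
is_supported_task_type() {
  case "$1" in
    super_agent|podcasts|docs|slides|sheets|deep_research) return 0 ;;
    website|video_generation|audio_generation|meeting_notes|cross_check) return 0 ;;
    *) return 1 ;;
  esac
}

for t in slides blog; do
  if is_supported_task_type "$t"; then
    echo "$t: supported"
  else
    echo "$t: not in the documented list"
  fi
done
# prints:
#   slides: supported
#   blog: not in the documented list
```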

ACP Mode

Use --acp to start a task agent as an Agent Client Protocol stdio server. This enables AI agent platforms like Genspark Claw to natively discover and interact with GSK agents, with multi-turn conversation support.

# Start an ACP agent for slides (used by acpx, not typically run manually)
gsk task slides --acp

# Start an ACP agent for documents
gsk task docs --acp

acpx configuration (~/.acpx/config.json):

{
  "agents": {
    "gsk-slides": { "command": "gsk task slides --acp" },
    "gsk-docs":   { "command": "gsk task docs --acp" },
    "gsk-sheets": { "command": "gsk task sheets --acp" }
  }
}

Then, in Genspark Claw, run /acp spawn gsk-slides to create and iterate on presentations in natural language.

Stock Prices

stock_price (alias: `stock`)

Retrieve stock price information and financial data.

gsk stock AAPL
gsk stock MSFT

Service-Level Tools

External service integrations are available as service-level tools — each service is a single command with an action parameter that dispatches to the underlying operation.

Requirements: Connect services in Genspark Account Settings → Integrations.

gmail

Gmail operations: search, read, send, reply, forward, delete, archive, move, mark_as_read, add_label, remove_label, create_label, get_attachment, list_send_as.

gsk gmail search --query "from:boss subject:report"
gsk gmail read --id 19cbfecd7fb14d46
gsk gmail send --to user@example.com --subject "Hello" --body "<p>Hi!</p>"
gsk gmail forward --message_id 19cbfecd7fb14d46 --to user@example.com
gsk gmail archive --message_id 19cbfecd7fb14d46

outlook_email

Outlook Email operations: search, read, send, reply, reply_draft, forward, delete, archive, move, mark_as_read, add_category, remove_category, get_attachment, group_list, group_search, group_read, group_reply.

gsk outlook_email search --queryString "quarterly report"
gsk outlook_email read --messageId AAMkAG...
gsk outlook_email send --to user@example.com --subject "Update" --body "Hi!"

google_calendar

Google Calendar operations: list, create, delete.

gsk google_calendar list
gsk google_calendar create --summary "Team Sync" --start_time "2026-04-20T10:00:00Z" --end_time "2026-04-20T11:00:00Z"

outlook_calendar

Outlook Calendar operations: list, create, delete.

gsk outlook_calendar list

meeting

Meeting notes operations: list, search, get.

gsk meeting list
gsk meeting search --keyword "quarterly planning"
gsk meeting get --task_id "e02fd0f1-..."

google_drive

Google Drive file operations: search, read, upload.

gsk google_drive search --query "budget 2026"
gsk google_drive read --file_id 1hq9kH63sc...

google_sheets

Google Sheets operations: create, read, write, append, search, export.

gsk google_sheets search --query "sales report"
gsk google_sheets read --spreadsheet_id 1ABC... --range "Sheet1!A1:D10"

google_docs

Google Docs operations: create, read, append, search.

gsk google_docs search --query "meeting notes"
gsk google_docs read --document_id 1ABC...

google_contacts

Google Contacts operations: search, get, create, update.

gsk google_contacts search --query "John"

github

GitHub operations: list_repos, search_issues, create_issue, update_issue.

gsk github list_repos
gsk github search_issues --q "repo:owner/repo is:open label:bug"
gsk github create_issue --owner myorg --repo myrepo --title "Bug report" --body "Description..."

slack

Slack messaging operations: send, search, lookup.

gsk slack search --query "deployment update"
gsk slack lookup --lookup_type channels --search_query "engineering"
gsk slack send --recipient "#general" --message "Hello team!"

notion

Notion page operations: search, read, create.

gsk notion search --query "project roadmap"
gsk notion read --page_id 2ce8b6a5-...

microsoft_teams

Microsoft Teams operations: send, list_channels, list_chats, list_teams, search, search_users, create_chat.

gsk microsoft_teams list_teams
gsk microsoft_teams list_channels --team_id 6c0db3a9-...
gsk microsoft_teams search --query "release notes"

onedrive

OneDrive file operations: list, search, read.

gsk onedrive search --query "presentation"
gsk onedrive list --folder_path "/Documents"

sharepoint

SharePoint operations: list, search, read_content, read_file.

gsk sharepoint search --query "company wiki"
gsk sharepoint list --site_id abc123

outlook_contacts

Outlook Contacts operations: search.

gsk outlook_contacts search --query "John"

AI Phone Calls

phone-call (alias: `call-for-me`)

Make an AI phone call on your behalf. The AI validates prerequisites, resolves contact info, and initiates the call.

# Call a business by phone number
gsk phone-call "Pizza Hut" -c "+1-555-123-4567" -p "Check if they deliver to my area"

# Call a business by Google Maps place_id
gsk phone-call "Joe's Pizza" -c "ChIJxxxxxxxx" --is_place_id -p "Reserve a table for 4"

# Dry run: validate and resolve contact info without initiating the call
gsk phone-call "Pizza Hut" -c "+1-555-123-4567" -p "Check hours" --dry-run

| Option | Default | Description |
|--------|---------|-------------|
| <recipient> | — | Name of the person or business to call (required, positional) |
| -c, --contact_info <value> | — | Phone number or Google Maps place_id (required) |
| --is_place_id | false | Treat contact_info as a Google Maps place_id |
| -p, --purpose <value> | — | Purpose of the call (required) |
| --dry-run | — | Only validate and resolve contact info, do not initiate the call |

Social Media

Retrieve data from Twitter/X, Instagram, and Reddit. All social commands are grouped under gsk social.

social twitter

Search and retrieve data from Twitter/X. 12 actions available.

# Search tweets
gsk social twitter search_posts -q "artificial intelligence" --start_date 2026-03-01 --language en

# Search users
gsk social twitter search_users -q "openai" --limit 5

# Get tweets by a specific author
gsk social twitter get_posts_by_author -q "elonmusk" --start_date 2026-01-01

# Get tweets by IDs
gsk social twitter get_posts_by_ids --post_ids "123456789,987654321"

# Get user profile
gsk social twitter get_user -q "elonmusk"

# Get followers or following
gsk social twitter get_user_connections -q "elonmusk" --connection_type followers

# Get users by keywords (mentioned in tweets)
gsk social twitter get_users_by_keywords -q "machine learning" --start_date 2026-01-01

# Get comments on a tweet
gsk social twitter get_comments -p "123456789" --start_date 2026-03-01

# Get quotes of a tweet
gsk social twitter get_quotes -p "123456789"

# Get retweets of a tweet
gsk social twitter get_retweets -p "123456789"

# Get users who interacted with a tweet
gsk social twitter get_post_interacting_users -p "123456789" --interaction_type retweeters

# Count posts matching a query
gsk social twitter count_posts -q "AI" --start_date 2026-03-01 --end_date 2026-03-10

| Option | Default | Description |
|--------|---------|-------------|
| <action> | — | Action to perform (required, positional) |
| -q, --query <text> | — | Search query, username, or identifier |
| -p, --post_id <id> | — | Tweet/post ID |
| --post_ids <ids> | — | Comma-separated tweet IDs |
| --connection_type <type> | followers | followers or following |
| --interaction_type <type> | retweeters | commenters, quoters, or retweeters |
| --start_date <YYYY-MM-DD> | — | Start date filter |
| --end_date <YYYY-MM-DD> | — | End date filter |
| --language <code> | — | Language filter (e.g., en, zh) |
| --limit <n> | — | Max number of results |

Actions: search_posts, search_users, get_posts_by_author, get_posts_by_ids, get_user, get_user_connections, get_users_by_keywords, get_comments, get_quotes, get_retweets, get_post_interacting_users, count_posts
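
Since --post_ids expects a single comma-separated string, a script holding IDs as separate arguments needs to join them first. A minimal sketch, assuming a hypothetical helper named `join_ids` (not part of `gsk`):

```shell
# join_ids is a hypothetical helper, not a gsk command.
# It joins its arguments with commas, the format --post_ids expects.
join_ids() {
  local IFS=,
  printf '%s\n' "$*"
}

join_ids 123456789 987654321   # prints: 123456789,987654321
```

The output could then be used as `--post_ids "$(join_ids 123456789 987654321)"` with get_posts_by_ids.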

social instagram

Search and retrieve data from Instagram. 9 actions available.

# Search posts
gsk social instagram search_posts -q "travel photography" --start_date 2026-01-01

# Search users
gsk social instagram search_users -q "natgeo" --limit 5

# Get posts by a specific user
gsk social instagram get_posts_by_user -q "natgeo" --start_date 2026-03-01

# Get posts by IDs
gsk social instagram get_posts_by_ids --post_ids "abc123,def456"

# Get user profile
gsk social instagram get_user -q "natgeo"

# Get followers or following
gsk social instagram get_user_connections -q "natgeo" --connection_type following

# Get users by keywords
gsk social instagram get_users_by_keywords -q "landscape photographer"

# Get comments on a post
gsk social instagram get_comments -p "abc123" --start_date 2026-03-01

# Get users who liked or commented on a post
gsk social instagram get_post_interacting_users -p "abc123" --interaction_type likers

| Option | Default | Description |
|--------|---------|-------------|
| <action> | — | Action to perform (required, positional) |
| -q, --query <text> | — | Search query, username, or identifier |
| -p, --post_id <id> | — | Post ID |
| --post_ids <ids> | — | Comma-separated post IDs |
| --connection_type <type> | followers | followers or following |
| --interaction_type <type> | likers | likers or commenters |
| --start_date <YYYY-MM-DD> | — | Start date filter |
| --end_date <YYYY-MM-DD> | — | End date filter |
| --limit <n> | — | Max number of results |

Actions: search_posts, search_users, get_posts_by_user, get_posts_by_ids, get_user, get_user_connections, get_users_by_keywords, get_comments, get_post_interacting_users
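
The --start_date and --end_date filters are documented as YYYY-MM-DD. A script can check the shape of a date string before passing it on. This is a rough sketch; `is_iso_date` is a hypothetical helper that checks only the digit-and-hyphen layout, not whether the date exists on the calendar, and it says nothing about how the CLI itself validates dates.

```shell
# is_iso_date is a hypothetical helper, not a gsk command.
# It checks the YYYY-MM-DD shape only (four digits, two digits, two digits).
is_iso_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

for d in 2026-03-01 03/01/2026; do
  if is_iso_date "$d"; then
    echo "$d: looks like YYYY-MM-DD"
  else
    echo "$d: unexpected format"
  fi
done
# prints:
#   2026-03-01: looks like YYYY-MM-DD
#   03/01/2026: unexpected format
```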

social reddit

Search and retrieve data from Reddit. 9 actions available.

# Search posts (with sort and time filters)
gsk social reddit search_posts -q "rust programming" --sort top --time week -s "programming"

# Search comments
gsk social reddit search_comments -q "async await" -s "rust"

# Search users
gsk social reddit search_users -q "spez" --limit 5

# Search subreddits
gsk social reddit search_subreddits -q "machine learning" --limit 10

# Get a post with its comments
gsk social reddit get_post_with_comments -p "1abc2de"

# Get subreddit info with recent posts
gsk social reddit get_subreddit_with_posts -q "programming"

# Get subreddits by keywords
gsk social reddit get_subreddits_by_keywords -q "artificial intelligence"

# Get user profile
gsk social reddit get_user -q "spez"

# Get users by keywords (active in discussions)
gsk social reddit get_users_by_keywords -q "neural networks" -s "MachineLearning"

| Option | Default | Description |
|--------|---------|-------------|
| <action> | — | Action to perform (required, positional) |
| -q, --query <text> | — | Search query, username, or subreddit name |
| -p, --post_id <id> | — | Post ID |
| -s, --subreddit <name> | — | Subreddit name filter |
| --sort <order> | — | Sort: relevance, hot, top, new, comments |
| --time <range> | — | Time filter: hour, day, week, month, year, all |
| --start_date <YYYY-MM-DD> | — | Start date filter |
| --end_date <YYYY-MM-DD> | — | End date filter |
| --limit <n> | — | Max number of results |

Actions: search_posts, search_comments, search_users, search_subreddits, get_post_with_comments, get_subreddit_with_posts, get_subreddits_by_keywords, get_user, get_users_by_keywords

Local File Handling

Most commands that accept URLs also accept local file paths. The CLI automatically uploads local files before passing them to the API:

# Local paths work anywhere a URL is accepted:
gsk analyze "Describe this" -i ./photo.jpg
gsk img "Enhance this" -i ./photo.png -o ./result.png
gsk video "Animate this" -i ./frame.jpg -o ./video.mp4

Use -o / --output-file to save generated results directly to a local file.
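
To make the URL-versus-local-path distinction concrete, here is a rough sketch of how a wrapper script might tell the two apart. `is_remote_url` is a hypothetical helper and an assumption on our part: the CLI's actual detection logic is not documented here, and this simplified check only recognizes http:// and https:// prefixes.

```shell
# is_remote_url is a hypothetical helper, not a gsk command.
# Simplified assumption: anything starting with http:// or https:// is a URL;
# everything else is treated as a local path.
is_remote_url() {
  case "$1" in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}

for input in "https://example.com/image.jpg" "./photo.jpg"; do
  if is_remote_url "$input"; then
    echo "$input -> URL, passed through"
  else
    echo "$input -> local path, would be uploaded first"
  fi
done
# prints:
#   https://example.com/image.jpg -> URL, passed through
#   ./photo.jpg -> local path, would be uploaded first
```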

Auto-Update

The CLI checks for updates every 4 hours and installs new versions in the background.

To disable auto-update:

# Via environment variable
export GSK_NO_AUTO_UPDATE=1

# Via config file
# Add "auto_update": false to ~/.genspark-tool-cli/config.json

Output Conventions

| Stream | Content | Consumer |
|--------|---------|----------|
| stdout | JSON result | Programs / AI agents |
| stderr | Progress, debug, error messages | Human / logs |

This separation allows programs to parse clean JSON from stdout while humans can follow progress on stderr.
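
The stream separation can be demonstrated without calling the API. In this sketch, `fake_tool` is a hypothetical stand-in for a `gsk` command that follows the same convention, and the JSON it prints is a placeholder, not the real response schema.

```shell
# fake_tool is a hypothetical stand-in for a gsk command. It mimics only the
# documented convention: JSON on stdout, progress messages on stderr.
# The JSON payload is a placeholder, not an actual gsk response.
fake_tool() {
  echo "progress: working..." >&2
  echo '{"status":"ok"}'
}

# Capture stdout only; route stderr to a log file so the JSON stays clean.
log_file="$(mktemp)"
result="$(fake_tool 2>"$log_file")"

echo "$result"        # prints: {"status":"ok"}
cat "$log_file" >&2   # progress lines, kept out of $result
rm -f "$log_file"
```

The same redirection pattern should apply to a real invocation, for example `result="$(gsk search "query" 2>gsk.log)"`.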

Available Models

Image Generation

| Model | Description |
|-------|-------------|
| nano-banana-2 | Gemini 3.1 Flash Image - Fast and efficient with advanced reasoning. Multi-image fusion with up to 14 references. Supports 0.5K-4K resolution |
| fal-ai/gpt-image-1.5 | GPT Image 1.5 - Supports text-to-image and image editing with multi-image input |
| imagen4 | Latest high quality image generation model, upgrade from Imagen 3 |
| recraft-v3 | Realistic image generation model |
| fal-ai/bytedance/seedream/v5/lite | Bytedance Seedream v5 Lite - Text-to-image and image editing with native 2K resolution and excellent text layout |
| fal-ai/flux-2 | Flux 2 - Text-to-image and image editing with enhanced realism and crisp text generation. Supports up to 3 images for edit mode |
| fal-ai/flux-2-pro | Flux 2 Pro - Higher quality version of Flux 2 with professional-grade output |
| fal-ai/z-image/turbo | Z-Image Turbo - Optimized for speed. Good for quick iterations, bulk generation, and style transfer |
| ideogram/V_3 | Ideogram V3 - Character reference specialist with superior facial feature preservation and character consistency |
| qwen-image | Chinese poster specialist with outstanding Chinese text rendering and cultural context mastery |
| bbox-segment | Extract subjects from images based on bounding box region |
| fal-bria-rmbg | Remove background from image |
| fal-ai/recraft-clarity-upscale | Upscale image |
| fal-ai/image-editing/text-removal | Remove text and watermarks from images while preserving background |
| flux-pro/outpaint | Expand image to a specific aspect ratio |

Video Generation

| Model | Capabilities | Aspect Ratios | Duration | Notes |
|-------|-------------|---------------|----------|-------|
| kling/v3 | Text/Image-to-video | 16:9, 9:16, 1:1 | 3-15s | Latest Kling V3 with audio. Pro/Standard quality modes |
| gemini/veo3.1 | Text/Image-to-video | 16:9, 9:16 | 8s | Latest Veo with enhanced quality. Supports fast_mode and hd_mode (1080p) |
| gemini/veo3.1/reference-to-video | Reference-to-video | 16:9, 9:16 | 8s | Generate video using 1+ reference images. Supports fast_mode and hd_mode |
| gemini/veo3.1/first-last-frame-to-video | Frame transition | 16:9, 9:16 | 8s | Precise transitions from first to last frame. Requires exactly 2 images |
| minimax/hailuo-2.3/standard | Text/Image-to-video | 16:9, 9:16 | 6s, 10s | Fast (~4min), cost-effective. Supports first & last frame control |
| wan/v2.6 | Text/Image/Video-to-video | 16:9, 9:16, 1:1, 4:3, 3:4 | 5s, 10s, 15s | 1080p with audio. Supports reference-to-video with 1-3 reference videos |
| vidu/q3 | Text/Image-to-video | 16:9, 9:16, 4:3, 3:4, 1:1 | 1-16s | Enhanced quality with audio generation. Resolution: 720p, 1080p |
| runway/gen4_turbo | Image-to-video | 5:3, 3:5 | 5s, 10s | Fast, high quality. Requires reference image |
| pixverse/v5 | Text/Image-to-video | 16:9, 9:16, 4:3, 1:1, 3:4 | 5s | Fast (~30s). Supports start/end frame transitions |
| fal-ai/bytedance/seedance/v1.5/pro | Text/Image-to-video | 21:9, 16:9, 4:3, 1:1, 3:4, 9:16 | 4-12s | Seedance v1.5 Pro with native audio support. Supports first & last frame control |
| sora-2 | Text/Image/Video-to-video | 16:9, 9:16 | 4s, 8s, 12s | OpenAI Sora 2 for fast, creative videos. Supports video remixing |
| sora-2-pro | Text/Image-to-video | 16:9, 9:16 | 4s, 8s | Sora 2 Pro - Higher fidelity, cinematic quality. 720p and 1080p |
| fal-ai/bytedance-upscaler/upscale/video | Video upscaling | — | — | Upscale existing videos to 2K. Requires video_url parameter |
| xai/grok-imagine-video | Text/Image-to-video | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, 9:21 | 1-15s | xAI Grok Imagine Video. 720p HD output |

Text-to-Speech (TTS)

| Model | Description |
|-------|-------------|
| google/gemini-2.5-pro-preview-tts | Best, high-quality, realistic TTS. Supports one or multiple speakers with speaker prefixes (e.g., Speaker1: text, Speaker2: text) |
| elevenlabs/v3-tts | Advanced multilingual TTS with multi-speaker dialogue support. Supports emotional tags like [excited], [whispers], [laughs] |
| fal-ai/elevenlabs/tts/multilingual-v2 | High-quality multilingual TTS. Preferred for English |
| fal-ai/minimax/speech-2.8-hd | High-quality multilingual TTS. Preferred for Chinese, Cantonese, Japanese, Korean. One speaker per generation |

Sound Effects

| Model | Description |
|-------|-------------|
| elevenlabs/sound-effects | Sound effect generation. Duration: 0.1-22 seconds |

Music Generation

| Model | Description |
|-------|-------------|
| elevenlabs/music | ElevenLabs music generation with vocals/singing. Lyrics auto-generated (no custom lyrics). Duration: 10s-5min |
| CassetteAI/music-generator | Background music generation. Duration: 10-180 seconds |
| mureka/song-generator | Professional song generation with lyrics. Supports style prompts, reference tracks, vocal and melody inputs. Max: 180s |
| mureka/instrumental-generator | Instrumental music generation without vocals. Supports style prompts and reference tracks. Max: 180s |
| fal-ai/lyria2 | Google Lyria 2 text-to-music. Good for sound effects and lyrics-free music. Max: 30 seconds |
| fal-ai/minimax-music/v2.6 | Song generation with lyrics using MiniMax Music 2.6. Supports markers (Verse), (Chorus), (Bridge), etc. Requires style prompt and lyrics |

Voice Cloning & Transformation

| Model | Description |
|-------|-------------|
| elevenlabs/voice-clone | Clone a voice from audio samples. Returns voice ID for use in TTS generation |
| elevenlabs/voice-changer | Transform audio from one voice to another. Requires source audio and target voice ID |
| fal-ai/minimax/voice-clone | Clone a voice from a sample audio and generate speech from text prompts (gated feature) |

License

MIT