@genspark/cli
v1.0.13
CLI tool for Genspark Tool API - search, crawl, analyze images, generate media
Genspark CLI (gsk)
One CLI. Every AI capability. Search, generate, analyze, communicate — all from your terminal.
gsk is the command-line interface for the Genspark AI platform. It unifies 90+ AI tools behind a single binary: web search, image/video/audio generation with 40+ models, document analysis, media transcription, cloud file management, email (Gmail & Outlook), calendar, GitHub, Slack, Notion, Microsoft Teams, OneDrive, SharePoint, AI phone calls, stock data, social media data (Twitter, Instagram, Reddit), and autonomous AI agents — all with clean JSON output for seamless integration with AI coding assistants, automation pipelines, and scripts.
Capability Map
| Category | What You Get |
|----------|-------------|
| 🔍 Search | Web search, image search |
| 📄 Documents | Crawl pages, summarize PDFs/docs |
| 🎨 Images | 16 models: GPT Image, Gemini, Flux 2, Imagen 4, Recraft, Ideogram, Seedream ... |
| 🎬 Videos | 14 models: Kling V3, Veo 3.1, Sora 2, Hailuo, Wan, Runway, PixVerse, Seedance ... |
| 🎵 Audio | 14 models: Gemini TTS, ElevenLabs, MiniMax, Mureka, CassetteAI, Lyria 2 ... |
| 🧠 Analysis | Image/video/audio understanding, OCR, video style replication |
| 📝 Transcribe | Whisper, Gemini, ElevenLabs Scribe |
| ☁️ AI Drive | Cloud file storage, download, compress |
| 📧 Email | Gmail & Outlook: read, search, send, reply, forward, archive, labels, attachments |
| 📅 Calendar | Google & Outlook: list, create, delete events |
| 💬 Collaboration | Slack, Microsoft Teams, Notion: send messages, search, manage channels/pages |
| 📂 Cloud Storage | Google Drive, OneDrive, SharePoint, Google Sheets, Google Docs, Google Contacts |
| 🐙 GitHub | List repos, search/create/update issues |
| 📞 Phone | AI-powered phone calls to businesses |
| 📈 Stocks | Real-time stock prices |
| 📱 Social Media | Twitter/X, Instagram, Reddit: search posts/users, get comments, connections, and more (30 APIs) |
| 🤖 Agents | Podcasts, docs, slides, deep research, fact-checking, websites, batch media generation |
| 🔊 Voice | Voice cloning, voice changer |
Installation
npm install -g @genspark/cli
Requires Node.js >= 18.
Quick Start
# Log in via browser
gsk login
# Search the web
gsk search "latest AI news"
# Generate an image
gsk img "A beautiful sunset over mountains" -o ./sunset.png
# Crawl a webpage
gsk crawl "https://example.com/article"
Authentication
Log in with your Genspark account:
gsk login
This opens a browser for authentication and saves the API key to ~/.genspark-tool-cli/config.json.
Alternatively, provide an API key directly:
# Via environment variable
export GSK_API_KEY="gsk_..."
# Via CLI option
gsk search "query" --api-key "gsk_..."
To check your current identity:
gsk login-info
gsk me # shorthand
To log out:
gsk logout
Configuration
Config is loaded from three sources (highest priority first):
- CLI options: --api-key, --base-url, etc.
- Environment variables: GSK_API_KEY, GSK_BASE_URL, GSK_PROJECT_ID
- Config file: ~/.genspark-tool-cli/config.json
{
"api_key": "gsk_...",
"base_url": "https://www.genspark.ai",
"project_id": "project_abc123",
"debug": false,
"timeout": 300000
}
Global Options
| Option | Env Var | Default | Description |
|--------|---------|---------|-------------|
| --api-key <key> | GSK_API_KEY | — | API key (required) |
| --base-url <url> | GSK_BASE_URL | https://www.genspark.ai | API base URL |
| --project-id <id> | GSK_PROJECT_ID | — | Project ID for access control |
| --debug | — | false | Enable debug output |
| --timeout <ms> | — | 300000 (5 min) | Request timeout |
| --output <format> | — | json | Output format: json or text |
| --refresh | — | — | Force refresh cached tool schemas |
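A wrapper script can mirror the precedence described under Configuration (CLI options, then environment variables, then the config file). The sketch below is illustrative only: `first_set` and `resolve_base_url` are hypothetical helper names, not part of gsk, and the script does not read the config file itself. The caller passes in whatever value it found there.

```shell
# Hypothetical helper (not part of gsk): print the first non-empty argument.
# Pass candidates ordered from highest to lowest priority.
first_set() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Resolve the base URL the way the README orders its sources:
# CLI option value, then GSK_BASE_URL, then the config-file value,
# then the documented default.
# $1 = value given on the command line (may be empty)
# $2 = value found in the config file (may be empty)
resolve_base_url() {
  cli_base_url=$1
  file_base_url=$2
  first_set "$cli_base_url" "${GSK_BASE_URL:-}" "$file_base_url" "https://www.genspark.ai"
}
```

For example, with `GSK_BASE_URL` unset, `resolve_base_url "" ""` prints the default `https://www.genspark.ai`.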
Commands
list-tools (alias: ls)
List all available tools.
gsk list-tools
gsk ls
login-info (alias: me)
Show your current account info — email, name, and membership plan.
gsk login-info
gsk me
init-opencode
Generate an .opencode.json config file for OpenCode, pre-configured to use Genspark's LLM proxy with your API key.
# Generate with default model (claude-opus-4-6-1m)
gsk init-opencode
# Specify a different default model
gsk init-opencode --model claude-sonnet-4-6
# Write to a custom path
gsk init-opencode -o ./my-project/.opencode.json
| Option | Default | Description |
|--------|---------|-------------|
| --model <name> | claude-opus-4-6-1m | Default model for OpenCode |
| -o, --out <path> | .opencode.json (cwd) | Output file path |
init-skills
Sync GSK skill documents into the current project for AI agent discovery. Copies all skill docs and generates a CONTEXT.md entry point that AI agents (Claude Code, Gemini, etc.) can load automatically.
# Copy skills to .gsk/skills/ and generate CONTEXT.md
gsk init-skills
# Also generate .claude/ config for Claude Code
gsk init-skills --agent claude
# Generate config for all supported agents (Claude, Gemini)
gsk init-skills --agent all
# Custom output directory
gsk init-skills -o ./docs/gsk-skills
| Option | Default | Description |
|--------|---------|-------------|
| -o, --out <dir> | .gsk/skills (cwd) | Output directory for skills |
| --agent <type> | — | Generate agent config: claude, gemini, or all |
Search & Crawl
web_search (alias: search)
Search the web.
gsk search "latest AI news"
crawler (alias: crawl)
Extract content from a web page or document.
gsk crawl "https://example.com/article"
summarize_large_document (alias: summarize)
Analyze a document and answer questions about it.
gsk summarize "https://example.com/report.pdf" --question "What are the key findings?"
| Option | Description |
|--------|-------------|
| <url> | Document URL (required, positional) |
| --question <text> | Question about the document |
image_search (alias: img-search)
Search for images on the web.
gsk img-search "modern architecture"
Media Analysis & Transcription
understand_images (alias: analyze)
Analyze images with an AI vision model.
gsk analyze "Describe this image" -i "https://example.com/image.jpg"
gsk analyze "Extract all text" -i "https://img1.jpg" "https://img2.jpg"
gsk analyze "What's in this photo?" -i ./photo.jpg
| Option | Default | Description |
|--------|---------|-------------|
| -i, --image_urls <url...> | — | Image URL(s) or local file path(s) to analyze (required) |
| -r, --instruction <text> | — | Custom analysis instruction |
Image Generation
image_generation (alias: img)
Generate images using AI. Supports text-to-image and image-to-image.
# Text-to-image
gsk img "A beautiful sunset over mountains" -r "16:9" -o ./sunset.png
gsk img "Modern office at night" -s "4k" -r "1:1"
# Image-to-image (reference-based)
gsk img "A portrait in similar style" -i ./reference.png
| Option | Default | Description |
|--------|---------|-------------|
| -r, --aspect_ratio <ratio> | 1:1 | Aspect ratio (1:1, 16:9, 9:16) |
| -s, --image_size <size> | auto | Image size: auto, 2k, 4k |
| -m, --model <name> | — | Model to use (optional) |
| -i, --image_urls <url...> | — | Reference image URL(s) or local file(s) for image-to-image |
| -o, --output-file <path> | — | Download the generated file to a local path |
Video Generation
video_generation (alias: video)
Generate videos using AI.
gsk video "A cat playing with yarn" -m "kling/v1.6/standard" -d 5 -o ./cat.mp4
gsk video "Sunrise over a beach" -m "minimax/hailuo-02/standard" -r "16:9" -d 8
# Image-to-video
gsk video "Camera pan around the subject" -m "kling/v1.6/standard" -i ./photo.jpg
| Option | Default | Description |
|--------|---------|-------------|
| -m, --model <name> | — | Model (required). e.g., kling/v1.6/standard, minimax/hailuo-02/standard |
| -r, --aspect_ratio <ratio> | 16:9 | Aspect ratio |
| -d, --duration <sec> | 5 | Duration in seconds (2-15) |
| -i, --image_urls <url...> | — | Reference image URL(s) or local file(s) |
| -a, --audio_url <url> | — | Audio URL for soundtrack |
| -o, --output-file <path> | — | Download the generated file to a local path |
Audio Generation
audio_generation (alias: audio)
Generate audio: TTS, music, or sound effects.
# Text-to-speech
gsk audio "Hello, welcome to Genspark!" -m "google/gemini-2.5-pro-preview-tts" -r "professional female voice"
gsk audio "Hello, welcome to Genspark!" -m "google/gemini-2.5-pro-preview-tts" -o ./hello.mp3
# Music with lyrics
gsk audio "A pop song" -m "fal-ai/minimax-music/v2.6" -l "Verse 1: ..." -d 120
# Sound effect
gsk audio "Door creaking slowly open" -m "sfx-model"
| Option | Default | Description |
|--------|---------|-------------|
| -m, --model <name> | — | Model (required). e.g., elevenlabs/v3-tts, fal-ai/minimax/speech-2.6-hd |
| -d, --duration <sec> | 0 (auto) | Duration in seconds |
| -r, --requirements <text> | — | Voice requirements for TTS |
| -l, --lyrics <text> | — | Lyrics for song generation |
| -o, --output-file <path> | — | Download the generated file to a local path |
File Transfer
upload
Upload a local file and get a URL for use in other commands.
gsk upload "./image.png"
gsk upload "./document.pdf"
download
Download a file from a file wrapper URL.
# Get download URL only
gsk download "/api/files/s/abc123"
# Download and save to local file
gsk download "/api/files/s/abc123" -s "./downloaded.png"
| Option | Description |
|--------|-------------|
| -s, --save <path> | Download and save to local file path |
analyze_media (alias: media-analyze)
Analyze various types of media content including images, audio, and video.
gsk media-analyze -i "https://example.com/image.jpg" -r "Describe the content"
gsk media-analyze -i "https://example.com/video.mp4" -r "Summarize the video"
| Option | Default | Description |
|--------|---------|-------------|
| -i, --media_urls <urls> | — | Media URL(s) to analyze (required) |
| -r, --requirements <text> | — | Analysis instructions |
audio_transcribe (alias: transcribe)
Transcribe audio files to text.
gsk transcribe -i "https://example.com/audio.mp3"
gsk transcribe -i ./meeting.wav -m "whisper-large-v3"
| Option | Default | Description |
|--------|---------|-------------|
| -i, --audio_urls <url...> | — | Audio URL(s) or local file(s) to transcribe (required) |
| -m, --model <name> | — | Transcription model to use |
AI Drive (Cloud Storage)
aidrive (alias: drive)
AI-Drive file storage and management. List, create, delete, move files and directories. Download videos, audio, and files from URLs directly to AI-Drive.
# List files in root directory
gsk drive ls
gsk drive ls -p "/documents" -f file
# Create directory
gsk drive mkdir -p "/my-folder"
# Move file
gsk drive move -p "/old-path/file.txt" --target_path "/new-path/file.txt"
# Download video/audio/file to AI-Drive
gsk drive download_video --video_url "https://example.com/video.mp4" --target_folder "/videos"
gsk drive download_file --file_url "https://example.com/doc.pdf" --target_folder "/docs"
# Upload inline text content to AI-Drive
gsk drive upload --file_content "Hello World" --upload_path "/notes/hello.txt"
# Upload a local file directly to AI-Drive (streaming, supports 100MB+ files)
gsk drive upload --local_file ./report.pdf --upload_path /docs/report.pdf
gsk drive upload --local_file ./video.mp4 --upload_path /videos/demo.mp4
gsk drive upload --local_file ./photo.png # upload_path defaults to /photo.png
gsk drive upload --local_file ./doc.pdf --upload_path /docs/doc.pdf --override # overwrite existing
# Get readable URL for a file
gsk drive get_readable_url -p "/documents/report.pdf"
| Option | Default | Description |
|--------|---------|-------------|
| -p, --path <path> | — | File or directory path in AI-Drive |
| -f, --filter_type <type> | all | Filter: all, file, directory |
| --file_type <type> | all | File type filter: all, audio, video, image |
| --target_path <path> | — | Target path for move operations |
| --target_folder <path> | — | Target folder for downloads |
| --video_url <url> | — | Video URL for download_video action |
| --audio_url <url> | — | Audio URL for download_audio action |
| --file_url <url> | — | File URL for download_file action |
| --file_name <name> | — | Custom file name for downloads |
| --file_content <text> | — | Inline text content to upload |
| --local_file <path> | — | Local file path to upload directly to AI-Drive (streaming, no size limit) |
| --upload_path <path> | — | Destination path for upload (defaults to /<filename> for --local_file) |
| --override | false | Overwrite an existing file at the destination path |
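The default destination for `--local_file` uploads (`/<filename>` when `--upload_path` is omitted) can be computed ahead of time, for instance to log where a file will land. This is a minimal sketch; `default_upload_path` is a hypothetical helper name, not a gsk command.

```shell
# Hypothetical helper (not part of gsk): print the AI-Drive path that
# `gsk drive upload --local_file` is documented to use when --upload_path
# is omitted, i.e. "/" followed by the local file's name.
default_upload_path() {
  printf '/%s\n' "$(basename "$1")"
}

default_upload_path ./photo.png   # prints /photo.png
```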
AI Agents & Tasks
create_task (alias: task)
Create and execute tasks using specialized AI agents.
# Create a podcast
gsk task podcasts --task_name "AI Trends" --query "Create a podcast about AI trends" --instructions "Focus on practical applications"
# Create a document
gsk task docs --task_name "Quantum Report" --query "Write a report on quantum computing" --instructions "Include recent breakthroughs"
# Create slides
gsk task slides --task_name "Q4 Results" --query "Create a Q4 results presentation" --instructions "Use charts and data"
# Create a spreadsheet (returns file wrapper URL, use `gsk download` to save)
gsk task sheets --task_name "Sales Report" --query "Create a quarterly sales report with formulas" --instructions "Use formulas and formatting"
# Deep research
gsk task deep_research --task_name "Fusion Energy" --query "Research fusion energy advances" --instructions "Cover public and private sector"
# Fact-check a claim
gsk task cross_check --task_name "Earth shape" --query "The Earth is flat" --instructions "Verify this claim with evidence"
| Option | Default | Description |
|--------|---------|-------------|
| --task_name <name> | — | Name for the task (required) |
| --query <text> | — | Query describing what to create (required) |
| --instructions <text> | — | Detailed instructions (required) |
| --acp | false | Start as ACP (Agent Client Protocol) stdio agent for multi-turn use with Genspark Claw |
Supported task types: super_agent, podcasts, docs, slides, sheets, deep_research, website, video_generation, audio_generation, meeting_notes, cross_check
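Because `--task_name`, `--query`, and `--instructions` are all required and the task type must be one of the supported values, a script may want to check its inputs before spending an API call. The sketch below assumes nothing about gsk beyond the options and task types listed above. `is_task_type` and `run_task` are hypothetical helper names.

```shell
# Task types copied from the "Supported task types" list above.
is_task_type() {
  case "$1" in
    super_agent|podcasts|docs|slides|sheets|deep_research|website|video_generation|audio_generation|meeting_notes|cross_check)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Hypothetical wrapper: validate inputs locally, then call `gsk task`.
# Usage: run_task <type> <task_name> <query> <instructions>
run_task() {
  task_type=$1
  task_name=$2
  query=$3
  instructions=$4
  if ! is_task_type "$task_type"; then
    echo "unknown task type: $task_type" >&2
    return 2
  fi
  if [ -z "$task_name" ] || [ -z "$query" ] || [ -z "$instructions" ]; then
    echo "task_name, query, and instructions are all required" >&2
    return 2
  fi
  gsk task "$task_type" --task_name "$task_name" --query "$query" --instructions "$instructions"
}
```

A call such as `run_task docs "Quantum Report" "Write a report on quantum computing" "Include recent breakthroughs"` then runs the same command as the documents example above.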
ACP Mode
Use --acp to start a task agent as an Agent Client Protocol stdio server. This enables AI agent platforms like Genspark Claw to natively discover and interact with GSK agents, with multi-turn conversation support.
# Start an ACP agent for slides (used by acpx, not typically run manually)
gsk task slides --acp
# Start an ACP agent for documents
gsk task docs --acp
acpx configuration (~/.acpx/config.json):
{
"agents": {
"gsk-slides": { "command": "gsk task slides --acp" },
"gsk-docs": { "command": "gsk task docs --acp" },
"gsk-sheets": { "command": "gsk task sheets --acp" }
}
}
Then in Genspark Claw: /acp spawn gsk-slides to create and iterate on presentations via natural language.
Stock Prices
stock_price (alias: stock)
Retrieve stock price information and financial data.
gsk stock AAPL
gsk stock MSFT
Service-Level Tools
External service integrations are available as service-level tools — each service is a single command with an action parameter that dispatches to the underlying operation.
Requirements: Connect services in Genspark Account Settings → Integrations.
gmail
Gmail operations: search, read, send, reply, forward, delete, archive, move, mark_as_read, add_label, remove_label, create_label, get_attachment, list_send_as.
gsk gmail search --query "from:boss subject:report"
gsk gmail read --id 19cbfecd7fb14d46
gsk gmail send --to user@example.com --subject "Hello" --body "<p>Hi!</p>"
gsk gmail forward --message_id 19cbfecd7fb14d46 --to user@example.com
gsk gmail archive --message_id 19cbfecd7fb14d46
outlook_email
Outlook Email operations: search, read, send, reply, reply_draft, forward, delete, archive, move, mark_as_read, add_category, remove_category, get_attachment, group_list, group_search, group_read, group_reply.
gsk outlook_email search --queryString "quarterly report"
gsk outlook_email read --messageId AAMkAG...
gsk outlook_email send --to user@example.com --subject "Update" --body "Hi!"
google_calendar
Google Calendar operations: list, create, delete.
gsk google_calendar list
gsk google_calendar create --summary "Team Sync" --start_time "2026-04-20T10:00:00Z" --end_time "2026-04-20T11:00:00Z"
outlook_calendar
Outlook Calendar operations: list, create, delete.
gsk outlook_calendar list
meeting
Meeting notes operations: list, search, get.
gsk meeting list
gsk meeting search --keyword "quarterly planning"
gsk meeting get --task_id "e02fd0f1-..."
google_drive
Google Drive file operations: search, read, upload.
gsk google_drive search --query "budget 2026"
gsk google_drive read --file_id 1hq9kH63sc...
google_sheets
Google Sheets operations: create, read, write, append, search, export.
gsk google_sheets search --query "sales report"
gsk google_sheets read --spreadsheet_id 1ABC... --range "Sheet1!A1:D10"
google_docs
Google Docs operations: create, read, append, search.
gsk google_docs search --query "meeting notes"
gsk google_docs read --document_id 1ABC...
google_contacts
Google Contacts operations: search, get, create, update.
gsk google_contacts search --query "John"
github
GitHub operations: list_repos, search_issues, create_issue, update_issue.
gsk github list_repos
gsk github search_issues --q "repo:owner/repo is:open label:bug"
gsk github create_issue --owner myorg --repo myrepo --title "Bug report" --body "Description..."
slack
Slack messaging operations: send, search, lookup.
gsk slack search --query "deployment update"
gsk slack lookup --lookup_type channels --search_query "engineering"
gsk slack send --recipient "#general" --message "Hello team!"
notion
Notion page operations: search, read, create.
gsk notion search --query "project roadmap"
gsk notion read --page_id 2ce8b6a5-...
microsoft_teams
Microsoft Teams operations: send, list_channels, list_chats, list_teams, search, search_users, create_chat.
gsk microsoft_teams list_teams
gsk microsoft_teams list_channels --team_id 6c0db3a9-...
gsk microsoft_teams search --query "release notes"
onedrive
OneDrive file operations: list, search, read.
gsk onedrive search --query "presentation"
gsk onedrive list --folder_path "/Documents"
sharepoint
SharePoint operations: list, search, read_content, read_file.
gsk sharepoint search --query "company wiki"
gsk sharepoint list --site_id abc123
outlook_contacts
Outlook Contacts operations: search.
gsk outlook_contacts search --query "John"
AI Phone Calls
phone-call (alias: call-for-me)
Make an AI phone call on your behalf. The AI validates prerequisites, resolves contact info, and initiates the call.
# Call a business by phone number
gsk phone-call "Pizza Hut" -c "+1-555-123-4567" -p "Check if they deliver to my area"
# Call a business by Google Maps place_id
gsk phone-call "Joe's Pizza" -c "ChIJxxxxxxxx" --is_place_id -p "Reserve a table for 4"
# Dry run: validate and resolve contact info without initiating the call
gsk phone-call "Pizza Hut" -c "+1-555-123-4567" -p "Check hours" --dry-run
| Option | Default | Description |
|--------|---------|-------------|
| <recipient> | — | Name of the person or business to call (required, positional) |
| -c, --contact_info <value> | — | Phone number or Google Maps place_id (required) |
| --is_place_id | false | Treat contact_info as a Google Maps place_id |
| -p, --purpose <value> | — | Purpose of the call (required) |
| --dry-run | — | Only validate and resolve contact info, do not initiate the call |
Social Media
Retrieve data from Twitter/X, Instagram, and Reddit. All social commands are grouped under gsk social.
social twitter
Search and retrieve data from Twitter/X. 12 actions available.
# Search tweets
gsk social twitter search_posts -q "artificial intelligence" --start_date 2026-03-01 --language en
# Search users
gsk social twitter search_users -q "openai" --limit 5
# Get tweets by a specific author
gsk social twitter get_posts_by_author -q "elonmusk" --start_date 2026-01-01
# Get tweets by IDs
gsk social twitter get_posts_by_ids --post_ids "123456789,987654321"
# Get user profile
gsk social twitter get_user -q "elonmusk"
# Get followers or following
gsk social twitter get_user_connections -q "elonmusk" --connection_type followers
# Get users by keywords (mentioned in tweets)
gsk social twitter get_users_by_keywords -q "machine learning" --start_date 2026-01-01
# Get comments on a tweet
gsk social twitter get_comments -p "123456789" --start_date 2026-03-01
# Get quotes of a tweet
gsk social twitter get_quotes -p "123456789"
# Get retweets of a tweet
gsk social twitter get_retweets -p "123456789"
# Get users who interacted with a tweet
gsk social twitter get_post_interacting_users -p "123456789" --interaction_type retweeters
# Count posts matching a query
gsk social twitter count_posts -q "AI" --start_date 2026-03-01 --end_date 2026-03-10
| Option | Default | Description |
|--------|---------|-------------|
| <action> | — | Action to perform (required, positional) |
| -q, --query <text> | — | Search query, username, or identifier |
| -p, --post_id <id> | — | Tweet/post ID |
| --post_ids <ids> | — | Comma-separated tweet IDs |
| --connection_type <type> | followers | followers or following |
| --interaction_type <type> | retweeters | commenters, quoters, or retweeters |
| --start_date <YYYY-MM-DD> | — | Start date filter |
| --end_date <YYYY-MM-DD> | — | End date filter |
| --language <code> | — | Language filter (e.g., en, zh) |
| --limit <n> | — | Max number of results |
Actions: search_posts, search_users, get_posts_by_author, get_posts_by_ids, get_user, get_user_connections, get_users_by_keywords, get_comments, get_quotes, get_retweets, get_post_interacting_users, count_posts
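The date filters expect `YYYY-MM-DD`, so a script that builds queries from user input may want to reject other shapes before calling the CLI. The sketch below is one way to do that. `is_iso_date` and `count_posts_between` are hypothetical helper names, and the check covers only the shape of the string, not calendar validity.

```shell
# Hypothetical helper: accept only strings shaped like YYYY-MM-DD.
# This checks the shape only; an impossible date such as 2026-02-31 passes.
is_iso_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper around the documented count_posts action.
# Usage: count_posts_between <query> <start_date> <end_date>
count_posts_between() {
  query=$1
  start=$2
  end=$3
  if ! is_iso_date "$start" || ! is_iso_date "$end"; then
    echo "dates must be formatted as YYYY-MM-DD" >&2
    return 2
  fi
  gsk social twitter count_posts -q "$query" --start_date "$start" --end_date "$end"
}
```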
social instagram
Search and retrieve data from Instagram. 9 actions available.
# Search posts
gsk social instagram search_posts -q "travel photography" --start_date 2026-01-01
# Search users
gsk social instagram search_users -q "natgeo" --limit 5
# Get posts by a specific user
gsk social instagram get_posts_by_user -q "natgeo" --start_date 2026-03-01
# Get posts by IDs
gsk social instagram get_posts_by_ids --post_ids "abc123,def456"
# Get user profile
gsk social instagram get_user -q "natgeo"
# Get followers or following
gsk social instagram get_user_connections -q "natgeo" --connection_type following
# Get users by keywords
gsk social instagram get_users_by_keywords -q "landscape photographer"
# Get comments on a post
gsk social instagram get_comments -p "abc123" --start_date 2026-03-01
# Get users who liked or commented on a post
gsk social instagram get_post_interacting_users -p "abc123" --interaction_type likers
| Option | Default | Description |
|--------|---------|-------------|
| <action> | — | Action to perform (required, positional) |
| -q, --query <text> | — | Search query, username, or identifier |
| -p, --post_id <id> | — | Post ID |
| --post_ids <ids> | — | Comma-separated post IDs |
| --connection_type <type> | followers | followers or following |
| --interaction_type <type> | likers | likers or commenters |
| --start_date <YYYY-MM-DD> | — | Start date filter |
| --end_date <YYYY-MM-DD> | — | End date filter |
| --limit <n> | — | Max number of results |
Actions: search_posts, search_users, get_posts_by_user, get_posts_by_ids, get_user, get_user_connections, get_users_by_keywords, get_comments, get_post_interacting_users
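`--post_ids` takes a single comma-separated string, which is awkward when the IDs arrive as separate arguments or lines. A small helper can join them. `join_ids` is a hypothetical name used only for this sketch.

```shell
# Hypothetical helper: join all arguments with commas.
# The subshell keeps the IFS change from leaking into the caller.
join_ids() {
  (
    IFS=,
    printf '%s\n' "$*"
  )
}

join_ids abc123 def456   # prints abc123,def456
```

The result can be passed straight to the documented option, for example `gsk social instagram get_posts_by_ids --post_ids "$(join_ids abc123 def456)"`.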
social reddit
Search and retrieve data from Reddit. 9 actions available.
# Search posts (with sort and time filters)
gsk social reddit search_posts -q "rust programming" --sort top --time week -s "programming"
# Search comments
gsk social reddit search_comments -q "async await" -s "rust"
# Search users
gsk social reddit search_users -q "spez" --limit 5
# Search subreddits
gsk social reddit search_subreddits -q "machine learning" --limit 10
# Get a post with its comments
gsk social reddit get_post_with_comments -p "1abc2de"
# Get subreddit info with recent posts
gsk social reddit get_subreddit_with_posts -q "programming"
# Get subreddits by keywords
gsk social reddit get_subreddits_by_keywords -q "artificial intelligence"
# Get user profile
gsk social reddit get_user -q "spez"
# Get users by keywords (active in discussions)
gsk social reddit get_users_by_keywords -q "neural networks" -s "MachineLearning"
| Option | Default | Description |
|--------|---------|-------------|
| <action> | — | Action to perform (required, positional) |
| -q, --query <text> | — | Search query, username, or subreddit name |
| -p, --post_id <id> | — | Post ID |
| -s, --subreddit <name> | — | Subreddit name filter |
| --sort <order> | — | Sort: relevance, hot, top, new, comments |
| --time <range> | — | Time filter: hour, day, week, month, year, all |
| --start_date <YYYY-MM-DD> | — | Start date filter |
| --end_date <YYYY-MM-DD> | — | End date filter |
| --limit <n> | — | Max number of results |
Actions: search_posts, search_comments, search_users, search_subreddits, get_post_with_comments, get_subreddit_with_posts, get_subreddits_by_keywords, get_user, get_users_by_keywords
Local File Handling
Most commands that accept URLs also accept local file paths. The CLI automatically uploads local files before passing them to the API:
# Local paths are uploaded automatically before the request is sent:
gsk analyze "Describe this" -i ./photo.jpg
gsk img "Enhance this" -i ./photo.png -o ./result.png
gsk video "Animate this" -i ./frame.jpg -o ./video.mp4
Use -o / --output-file to save generated results directly to a local file.
Auto-Update
The CLI checks for updates every 4 hours and installs new versions in the background.
To disable auto-update:
# Via environment variable
export GSK_NO_AUTO_UPDATE=1
# Via config file
# Add "auto_update": false to ~/.genspark-tool-cli/config.json
Output Conventions
| Stream | Content | Consumer |
|--------|---------|----------|
| stdout | JSON result | Programs / AI agents |
| stderr | Progress, debug, error messages | Human / logs |
This separation allows programs to parse clean JSON from stdout while humans can follow progress on stderr.
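In a script, the stdout/stderr split means the two streams can be redirected independently. The sketch below saves the JSON result to one file and appends progress and error text to another. `gsk_capture` is a hypothetical wrapper name, and the sketch assumes gsk exits with a non-zero status on failure, which this README does not state.

```shell
# Hypothetical wrapper (not part of gsk): run any gsk command, writing the
# JSON result from stdout to one file and appending stderr to a log file.
# Usage: gsk_capture <result_file> <log_file> <gsk arguments...>
gsk_capture() {
  result_file=$1
  log_file=$2
  shift 2
  gsk "$@" >"$result_file" 2>>"$log_file"
}

# Example (requires gsk to be installed and authenticated):
#   gsk_capture result.json gsk.log search "latest AI news"
```

Keeping the log in append mode preserves progress output across several calls, while each result file holds only the JSON from its own command.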
Available Models
Image Generation
| Model | Description |
|-------|-------------|
| nano-banana-2 | Gemini 3.1 Flash Image - Fast and efficient with advanced reasoning. Multi-image fusion with up to 14 references. Supports 0.5K-4K resolution |
| fal-ai/gpt-image-1.5 | GPT Image 1.5 - Supports text-to-image and image editing with multi-image input |
| imagen4 | Latest high quality image generation model, upgrade from Imagen 3 |
| recraft-v3 | Realistic image generation model |
| fal-ai/bytedance/seedream/v5/lite | Bytedance Seedream v5 Lite - Text-to-image and image editing with native 2K resolution and excellent text layout |
| fal-ai/flux-2 | Flux 2 - Text-to-image and image editing with enhanced realism and crisp text generation. Supports up to 3 images for edit mode |
| fal-ai/flux-2-pro | Flux 2 Pro - Higher quality version of Flux 2 with professional-grade output |
| fal-ai/z-image/turbo | Z-Image Turbo - Optimized for speed. Good for quick iterations, bulk generation, and style transfer |
| ideogram/V_3 | Ideogram V3 - Character reference specialist with superior facial feature preservation and character consistency |
| qwen-image | Chinese poster specialist with outstanding Chinese text rendering and cultural context mastery |
| bbox-segment | Extract subjects from images based on bounding box region |
| fal-bria-rmbg | Remove background from image |
| fal-ai/recraft-clarity-upscale | Upscale image |
| fal-ai/image-editing/text-removal | Remove text and watermarks from images while preserving background |
| flux-pro/outpaint | Expand image to a specific aspect ratio |
Video Generation
| Model | Capabilities | Aspect Ratios | Duration | Notes |
|-------|-------------|---------------|----------|-------|
| kling/v3 | Text/Image-to-video | 16:9, 9:16, 1:1 | 3-15s | Latest Kling V3 with audio. Pro/Standard quality modes |
| gemini/veo3.1 | Text/Image-to-video | 16:9, 9:16 | 8s | Latest Veo with enhanced quality. Supports fast_mode and hd_mode (1080p) |
| gemini/veo3.1/reference-to-video | Reference-to-video | 16:9, 9:16 | 8s | Generate video using 1+ reference images. Supports fast_mode and hd_mode |
| gemini/veo3.1/first-last-frame-to-video | Frame transition | 16:9, 9:16 | 8s | Precise transitions from first to last frame. Requires exactly 2 images |
| minimax/hailuo-2.3/standard | Text/Image-to-video | 16:9, 9:16 | 6s, 10s | Fast (~4min), cost-effective. Supports first & last frame control |
| wan/v2.6 | Text/Image/Video-to-video | 16:9, 9:16, 1:1, 4:3, 3:4 | 5s, 10s, 15s | 1080p with audio. Supports reference-to-video with 1-3 reference videos |
| vidu/q3 | Text/Image-to-video | 16:9, 9:16, 4:3, 3:4, 1:1 | 1-16s | Enhanced quality with audio generation. Resolution: 720p, 1080p |
| runway/gen4_turbo | Image-to-video | 5:3, 3:5 | 5s, 10s | Fast, high quality. Requires reference image |
| pixverse/v5 | Text/Image-to-video | 16:9, 9:16, 4:3, 1:1, 3:4 | 5s | Fast (~30s). Supports start/end frame transitions |
| fal-ai/bytedance/seedance/v1.5/pro | Text/Image-to-video | 21:9, 16:9, 4:3, 1:1, 3:4, 9:16 | 4-12s | Seedance v1.5 Pro with native audio support. Supports first & last frame control |
| sora-2 | Text/Image/Video-to-video | 16:9, 9:16 | 4s, 8s, 12s | OpenAI Sora 2 for fast, creative videos. Supports video remixing |
| sora-2-pro | Text/Image-to-video | 16:9, 9:16 | 4s, 8s | Sora 2 Pro - Higher fidelity, cinematic quality. 720p and 1080p |
| fal-ai/bytedance-upscaler/upscale/video | Video upscaling | — | — | Upscale existing videos to 2K. Requires video_url parameter |
| xai/grok-imagine-video | Text/Image-to-video | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, 9:21 | 1-15s | xAI Grok Imagine Video. 720p HD output |
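Since each video model accepts a different set of durations, a script can check `-d` against the table before submitting a job. The sketch below transcribes the limits for four models from the table above and is meant to be extended by hand. `duration_ok` is a hypothetical helper, and the CLI itself may enforce these limits differently.

```shell
# Hypothetical helper: check a duration (whole seconds) against the
# Duration column above. Only four models are covered in this sketch.
# Returns 0 if allowed, 1 if not allowed, 2 if the model is not covered.
duration_ok() {
  model=$1
  seconds=$2
  case "$model" in
    kling/v3)
      [ "$seconds" -ge 3 ] && [ "$seconds" -le 15 ] ;;
    gemini/veo3.1)
      [ "$seconds" -eq 8 ] ;;
    pixverse/v5)
      [ "$seconds" -eq 5 ] ;;
    sora-2)
      case "$seconds" in
        4|8|12) return 0 ;;
        *) return 1 ;;
      esac ;;
    *)
      return 2 ;;
  esac
}
```

For instance, `duration_ok kling/v3 10` succeeds, while `duration_ok gemini/veo3.1 5` fails because that model is listed with a fixed 8-second duration.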
Text-to-Speech (TTS)
| Model | Description |
|-------|-------------|
| google/gemini-2.5-pro-preview-tts | Best, high-quality, realistic TTS. Supports one or multiple speakers with speaker prefixes (e.g., Speaker1: text, Speaker2: text) |
| elevenlabs/v3-tts | Advanced multilingual TTS with multi-speaker dialogue support. Supports emotional tags like [excited], [whispers], [laughs] |
| fal-ai/elevenlabs/tts/multilingual-v2 | High-quality multilingual TTS. Preferred for English |
| fal-ai/minimax/speech-2.8-hd | High-quality multilingual TTS. Preferred for Chinese, Cantonese, Japanese, Korean. One speaker per generation |
Sound Effects
| Model | Description |
|-------|-------------|
| elevenlabs/sound-effects | Sound effect generation. Duration: 0.1-22 seconds |
Music Generation
| Model | Description |
|-------|-------------|
| elevenlabs/music | ElevenLabs music generation with vocals/singing. Lyrics auto-generated (no custom lyrics). Duration: 10s-5min |
| CassetteAI/music-generator | Background music generation. Duration: 10-180 seconds |
| mureka/song-generator | Professional song generation with lyrics. Supports style prompts, reference tracks, vocal and melody inputs. Max: 180s |
| mureka/instrumental-generator | Instrumental music generation without vocals. Supports style prompts and reference tracks. Max: 180s |
| fal-ai/lyria2 | Google Lyria 2 text-to-music. Good for sound effects and lyrics-free music. Max: 30 seconds |
| fal-ai/minimax-music/v2.6 | Song generation with lyrics using MiniMax Music 2.6. Supports markers (Verse), (Chorus), (Bridge), etc. Requires style prompt and lyrics |
Voice Cloning & Transformation
| Model | Description |
|-------|-------------|
| elevenlabs/voice-clone | Clone a voice from audio samples. Returns voice ID for use in TTS generation |
| elevenlabs/voice-changer | Transform audio from one voice to another. Requires source audio and target voice ID |
| fal-ai/minimax/voice-clone | Clone a voice from a sample audio and generate speech from text prompts (gated feature) |
License
MIT
