bailian-ai-toolkit
v1.0.1
Claude Code Skill - Aliyun Bailian (百炼) AI Toolkit. One-line commands for image/video generation/editing, TTS, ASR, vision, chat, web search, and file upload via DashScope.
🚀 Bailian AI Toolkit
Aliyun Model Studio (DashScope) CLI Toolkit — One-line commands for all AI generation tasks.
bailian-ai-toolkit is a Claude Code / OpenClaw skill that wraps the Aliyun Bailian (百炼) / DashScope AI platform into a single, unified CLI. Generate images, edit videos, synthesize speech, recognize voice, run vision understanding, chat with LLMs, search the web, and upload files — all with one-line commands.
✨ Features
| Category | Capability | Models |
|----------|-----------|--------|
| 🎨 Image | Text-to-image, image editing, style transfer | qwen-image-2.0 |
| 🎬 Video | Text-to-video, image-to-video, video editing | happyhorse-1.0-t2v, happyhorse-1.0-video-edit |
| 🗣️ TTS | Text-to-speech with 50+ voices | cosyvoice-v3-flash |
| 👂 ASR | Speech recognition, speaker diarization | fun-asr |
| 👁️ Vision | Image/video understanding, OCR | qwen-vl-max |
| 💬 Chat | Multi-model text generation | qwen3.6-plus, qwen3.6-max, deepseek-v3 |
| 🌐 Search | Web search with AI summarization | DashScope MCP |
| 📤 Upload | File upload to OSS (48h expiry) | OSS |
📦 Installation
1. Install the CLI
npm install -g bailian-cli
2. Configure API Key
# Get your API key from: https://bailian.console.aliyun.com/
bl auth login --api-key sk-xxxxxxxxxxxx
# Verify
bl auth status --output json
3. Install the Skill (for Claude Code / OpenClaw)
Copy SKILL.md to your skills directory, or use the skill manager.
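Before relying on the skill, it can help to confirm that the CLI from step 1 is actually on your PATH. The following is a minimal sketch; `have_cmd` and `preflight` are hypothetical helper names introduced here, not part of bailian-cli.

```shell
#!/bin/sh
# Preflight sketch: check that the `bl` command from step 1 can be resolved.
# have_cmd and preflight are illustrative helpers, not bailian-cli features.

have_cmd() {
  # command -v is the POSIX way to test whether a command name resolves
  command -v "$1" >/dev/null 2>&1
}

preflight() {
  if have_cmd bl; then
    echo "bl found"
  else
    echo "bl not found; run: npm install -g bailian-cli" >&2
    return 1
  fi
}

# Usage: preflight && bl auth status --output json
```

If `preflight` fails right after a global npm install, the usual cause is that npm's global bin directory is missing from PATH.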
🚀 Quick Start
# 🎨 Generate an image
bl image generate --prompt "A cyberpunk city at night" --n 4 --out-dir ./out/
# 🎬 Create a video
bl video generate --prompt "Ocean waves at sunset" --download sunset.mp4
# 🗣️ Synthesize speech
bl speech synthesize --text "你好,世界" --voice <voice_id> --out hello.mp3
# 👂 Transcribe audio
bl speech recognize --url ./meeting.mp3 --language zh
# 👁️ Understand an image
bl vision describe --image ./photo.jpg
# 💬 Chat with LLM
bl text chat --message "Explain quantum computing in simple terms"
# 🌐 Search the web
bl search web --query "Latest AI trends 2026"
# 📤 Upload a file
bl file upload ./document.pdf
📖 Detailed Usage
🎨 Image Generation
# Basic generation (4 images, 1:1 aspect ratio)
bl image generate --prompt "A cat in a spacesuit strolling on Mars" --n 4 --out-dir ./out/
# Custom size
bl image generate --prompt "..." --size 16:9 --n 2 --out-dir ./wallpapers/
# Custom output prefix
bl image generate --prompt "..." --n 6 --out-dir ./images/ --out-prefix my-image
# Disable automatic prompt expansion
bl image generate --prompt "..." --no-prompt-extend
✂️ Image Editing
# Background replacement
bl image edit --image ./photo.png --prompt "Replace the background with a beach sunset"
# Multi-image compositing
bl image edit --image ./a.png --image ./b.png --prompt "Merge the two images into a single collage"
# Object removal
bl image edit --image ./photo.png --prompt "Remove the people in the background"
# Style transfer
bl image edit --image ./photo.png --prompt "Convert to watercolor style"
🎬 Video Generation
# Text-to-video
bl video generate --prompt "Time-lapse of a sunset by the sea" --download sunset.mp4
# Image-to-video
bl video generate --image ./photo.jpg --prompt "Animate the scene" --download animated.mp4
# Video editing (style transfer)
bl video edit --video ./input.mp4 --prompt "Convert to anime style" --download anime.mp4
# Video editing (object replacement)
bl video edit --video ./input.mp4 --prompt "Replace the clothing" --ref-image ./clothes.png
# Async mode (don't wait for completion)
bl video generate --prompt "..." --no-wait
🗣️ Speech Synthesis (TTS)
# List available voices
bl speech synthesize --list-voices --model cosyvoice-v3-flash
# Chinese TTS
bl speech synthesize --text "你好,欢迎使用阿里云百炼" --voice <voice_id> --out output.mp3
# English TTS
bl speech synthesize --text "Hello, welcome to Alibaba Cloud" --voice <voice_id> --language en --out output.mp3
👂 Speech Recognition (ASR)
# Basic recognition (local file)
bl speech recognize --url ./meeting.mp3 --language zh
# Speaker diarization
bl speech recognize --url ./meeting.wav --diarization --speaker-count 3
# URL-based recognition
bl speech recognize --url https://example.com/audio.mp3
👁️ Vision Understanding
# Describe an image
bl vision describe --image ./photo.jpg
# Ask a specific question about an image
bl vision describe --image ./screenshot.png --prompt "Which buttons appear in this screenshot?"
# Video understanding
bl vision describe --video ./clip.mp4 --prompt "Summarize the video content"
💬 Text Chat
# Basic chat
bl text chat --message "Write a poem in Chinese"
# Multi-turn conversation
bl text chat --message "..." --conversation-id <id>
# Specify model
bl text chat --message "..." --model qwen3.6-max
🌐 Web Search
bl search web --query "AI development trends in 2026" --count 10
📤 File Upload
bl file upload ./document.pdf
# Returns an OSS URL valid for 48 hours
🛒 E-Commerce Product Image Template
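# The commands below use the PowerShell line-continuation character (`); in bash or zsh, use \ instead
# Front view - white background
bl image generate `
--prompt "Plain black men's short-sleeve T-shirt, Amazon e-commerce main image, white background, professional product photography, front-view flat lay" `
--n 2 --size 1:1 --out-dir ./product/ --out-prefix front
# Back view
bl image generate `
--prompt "Plain black men's short-sleeve T-shirt, Amazon e-commerce main image, white background, back-view flat lay" `
--n 2 --size 1:1 --out-dir ./product/ --out-prefix back
# Model wearing
bl image generate `
--prompt "Young male model wearing a plain black T-shirt, Amazon e-commerce main image, white background, professional fashion photography" `
--n 2 --size 1:1 --out-dir ./product/ --out-prefix model
# Detail close-up
bl image generate `
--prompt "Close-up of the collar of a plain black men's T-shirt, clearly visible fabric texture, macro photography" `
--n 2 --size 1:1 --out-dir ./product/ --out-prefix detail
# Lifestyle scene
bl image generate `
--prompt "Young man wearing a plain black T-shirt at an outdoor cafe, natural light, lifestyle photography" `
--n 2 --size 1:1 --out-dir ./product/ --out-prefix lifestyle
⚙️ Global Options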
| Flag | Description |
|------|-------------|
| --api-key sk-xxx | Override API Key for this command |
| --region cn/us/intl | Region (default: cn) |
| --output json/text | Output format |
| --quiet | Minimal output |
| --non-interactive | Disable interactive prompts |
| --help | Show command help |
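For scripted batches such as the product image template, the global flags in the table above can be combined with a small wrapper. The sketch below assumes that `--quiet` and `--non-interactive` are accepted after a subcommand, as `--output json` is in the authentication examples. `DRY_RUN`, `run_bl`, and `generate_view` are hypothetical names introduced for illustration, not bailian-cli features.

```shell
#!/bin/sh
# Batch sketch: run several image jobs with the same global flags.
# DRY_RUN is a variable of this script only; with the default of 1 the
# commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run_bl() {
  if [ "$DRY_RUN" = "1" ]; then
    # For display only: arguments are joined with spaces, not re-quoted
    printf '%s\n' "bl $*"
  else
    bl "$@"
  fi
}

generate_view() {
  # $1 = output prefix, $2 = prompt
  run_bl image generate --prompt "$2" --n 2 --size 1:1 \
    --out-dir ./product/ --out-prefix "$1" --quiet --non-interactive
}

generate_view front "Plain black T-shirt, white background, front-view flat lay"
generate_view back "Plain black T-shirt, white background, back-view flat lay"
```

Setting `DRY_RUN=0` runs the real commands. Printing first makes it easy to review prompts and output prefixes before spending quota.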
🔑 Authentication
# Login with API Key
bl auth login --api-key sk-xxxxxxxxxxxx
# Check status
bl auth status --output json
# Update CLI
bl update
🔑 Get your API Key: Aliyun Bailian Console (https://bailian.console.aliyun.com/)
📋 Key Rules
- Local files work directly: pass local paths and the CLI auto-uploads them for you
- Never hardcode API keys: use --api-key or environment variables
- Check --help first: every command has comprehensive built-in help
- Image generation expands prompts: use --no-prompt-extend to disable
- Video generation is async: the CLI waits by default; use --no-wait for fire-and-forget
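One way to follow the "never hardcode API keys" rule is to keep the key in an environment variable and pass it through `--api-key` at call time. This is a minimal sketch: `BAILIAN_API_KEY` is a hypothetical variable name chosen for this example (check `bl --help` for any variable the CLI reads on its own), and `require_api_key` and `bl_with_key` are illustrative helpers.

```shell
#!/bin/sh
# Sketch: supply the API key from the environment instead of the script body.
# BAILIAN_API_KEY is an assumed name for this example, not a documented one.

require_api_key() {
  # Fail early with a clear message when the key is missing or empty
  if [ -z "${BAILIAN_API_KEY:-}" ]; then
    echo "BAILIAN_API_KEY is not set" >&2
    return 1
  fi
}

bl_with_key() {
  require_api_key || return 1
  # --api-key overrides the stored key for this one command
  bl "$@" --api-key "$BAILIAN_API_KEY"
}

# Usage: bl_with_key text chat --message "Hello"
```

Arguments on a command line can be visible to other local users through process listings, so on shared machines the stored login from `bl auth login` may be the safer choice.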
📁 File Structure
bailian-ai-toolkit/
├── SKILL.md # Claude Code / OpenClaw skill definition
├── README.md # This file
└── reference/ # Full command reference (in bailian-cli skill)
🤝 Contributing
Issues and PRs welcome! This is a skill wrapper — the core CLI is at bailian-cli.
📄 License
MIT © 2025
Built with ❤️ for the AI developer community
