@getvhs/vhscli
v0.1.18
generate images and videos with ai
vhscli
vhscli is a small CLI for sending image, video, and chat jobs to VHS.
The client handles auth, input uploads, task creation, polling, and writing output files. Model execution stays on the VHS service, so users do not need provider API keys on their machine.
Quick Start
Run without installing:
npx @getvhs/vhscli@latest login
npx @getvhs/vhscli@latest models
npx @getvhs/vhscli@latest generate seedream-5 "a corgi astronaut riding a bicycle on mars" -o corgi.jpg
The installed binary is vhscli:
vhscli login
vhscli models
vhscli generate gpt-image-2 "a clean app icon for a video tool" -o icon.png
Use - when the prompt should come from stdin:
cat prompt.txt | vhscli generate gpt-image-2 - -o art.png
cat question.txt | vhscli chat - -f paper.pdf
Requirements
- Node.js 24 or newer (the task store uses the built-in node:sqlite)
- local image conversion uses sips on macOS, magick (ImageMagick) elsewhere
- ffmpeg when the requested video output format differs from the source
Auth
vhscli login
vhscli whoami
vhscli logout
login opens the browser, listens for the OAuth callback locally, writes the session to ~/.vhs/session.json, and initializes the user's billing account on first login.
whoami prints the session email, falling back to the user id.
logout deletes the local session file.
Commands that need auth will start login if no valid session exists.
Projects and the task store
When the CLI runs inside a project — signalled by the VHS_PROJECTDIR
environment variable (Cue sets it for every command) — generation tasks are
tracked in a per-project SQLite database at:
$VHS_PROJECTDIR/.vhs/gen.db
Each row records the remote task id, the requested and actual output paths
(relative to the project root), the output's mtime / size / sha256, and the
prompt, model, reference inputs, and submitted payload. This replaces the old
<output>.vhs_task sidecar files.
- submit and resume require a project (they exit if VHS_PROJECTDIR is unset). submit creates gen.db if needed; resume fails if it is missing.
- generate works with or without a project. With one it records the task (so an interrupted run can be finished later with resume <task_id>); without one it keeps the task in memory and writes the output directly.
vhscli chat does not touch the task store.
Commands
vhscli models
Prints the model aliases known to the CLI.
vhscli generate
generate is a command group. Use help to list model subcommands:
vhscli generate --help
Ask a model subcommand for its exact flags:
vhscli generate gpt-image-2 --help
generate submits the task, waits for it, and saves the output to -o
(required). Inside a project it also records the task in gen.db, so an
interrupted run can be finished with vhscli resume <task_id>.
vhscli submit <model> [options]
Same models and options as generate, but it submits the task, records it in gen.db, prints the task id (task_id: <uuid>), and exits without waiting. Requires a project.
vhscli submit seedance-2 "a robot dancing in tokyo at night" -o clip.mp4
# prints "task_id: <uuid>" and exits
vhscli resume <task_id>
# waits for the task to finish and writes clip.mp4
vhscli resume <ids...>
Finishes one or more submitted generations by their task id (the ids submit printed). For each id, resume waits if the task is still running, then writes the output to the path recorded at submit. It blocks until every task is done. Requires a project and an existing gen.db.
vhscli resume 7d3c1b2a-...
vhscli resume 7d3c1b2a-... 9a8b7c6d-...
If the original output directory was deleted, the file is written to the project root under its basename instead. A name conflict gets a Chrome-style (N) suffix, and the actual path is recorded back in gen.db.
The server job keeps running after the local process exits. resume attaches the CLI back to that task and writes the result when it is ready.
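The fallback naming described above can be sketched as follows. This is a minimal illustration rather than the CLI's actual code: `uniqueName` and the `taken` callback are hypothetical names, and the exact suffix layout (a space, then `(N)` before the extension, counting from 1) is an assumption modelled on how Chrome names duplicate downloads.

```typescript
import * as path from "node:path";

// Hypothetical helper: returns `name` unchanged if it is free, otherwise the
// first "stem (N).ext" variant (N = 1, 2, ...) that `taken` reports as unused.
function uniqueName(name: string, taken: (candidate: string) => boolean): string {
  if (!taken(name)) return name;
  const ext = path.extname(name);
  const stem = name.slice(0, name.length - ext.length);
  for (let n = 1; ; n++) {
    const candidate = `${stem} (${n})${ext}`;
    if (!taken(candidate)) return candidate;
  }
}

// Example with an in-memory set standing in for the project root listing.
const existing = new Set(["clip.mp4", "clip (1).mp4"]);
console.log(uniqueName("clip.mp4", (c) => existing.has(c))); // clip (2).mp4
```

Passing the existence check as a callback keeps the sketch independent of the file system; a real implementation would presumably check the project root on disk.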
vhscli chat <prompt>
Runs one chat request and writes the answer to stdout. It does not create an output file.
vhscli chat "explain how to make sourdough in 5 steps"
vhscli chat "describe this image as json" -i photo.jpg
vhscli chat "summarize this paper in 5 bullets; include page numbers" -f paper.pdf
vhscli chat "list key events with HH:mm:ss timestamps" -v clip.mp4 --fps 2
Options:
- -i <path> attaches an image. Repeat for multiple images.
- -f <path> attaches a PDF. Repeat for multiple files.
- -v <path> attaches one video.
- --fps <n> samples video from 0.2 to 5 frames per second. Default is 1.
Image Models
seedream-5
Seedream 5.0 image generation and editing.
vhscli generate seedream-5 "a girl in a yellow raincoat walking under a parasol, monet oil painting style" -o girl.jpg
vhscli generate seedream-5 "remove her hat, keep everything else" -i photo.jpg -o edit.jpg
Options:
- -o, --output <path> sets the output path.
- -i <path> adds a reference image. Maximum 14; repeat for more.
- --size <size> accepts 2K, 3K, or custom WxH. Default is 2K.
Custom sizes must pass the model pixel range and aspect ratio checks enforced by the CLI.
seedream-4-5
Seedream 4.5 image generation and editing.
vhscli generate seedream-4-5 "an open refrigerator with milk, eggs, leftover chicken, strawberries; warm light" -o fridge.jpg
vhscli generate seedream-4-5 "swap the dress to red, keep her pose unchanged" -i photo.jpg -o edit.jpg
Options:
- -o, --output <path> sets the output path.
- -i <path> adds a reference image. Maximum 14; repeat for more.
- --size <size> accepts 2K, 4K, or custom WxH. Default is 2K.
nano-banana-2
Nano Banana 2 image generation and editing.
vhscli generate nano-banana-2 "remove the man from the photo, keep everything else" -i photo.jpg -o clean.png
vhscli generate nano-banana-2 "a glossy candle in a bell jar on a marble counter, soft light" -o candle.png
Options:
- -o, --output <path> sets the output path.
- -i <path> adds a reference image. Maximum 14; repeat for more.
- --size <size> accepts 512, 1K, 2K, or 4K. Default is 1K.
Output aspect ratio is fixed to 1:1.
nano-banana-pro
Nano Banana Pro image generation and editing.
vhscli generate nano-banana-pro "a glossy face moisturizer jar on warm studio backdrop" -o jar.png
vhscli generate nano-banana-pro "a sun-drenched minimalist living room with a 3d armchair from this sketch" -i sketch.jpg -o room.png
Options:
- -o, --output <path> sets the output path.
- -i <path> adds a reference image. Maximum 14; repeat for more.
- --size <size> accepts 1K, 2K, or 4K. Default is 1K.
Output aspect ratio is fixed to 1:1.
gpt-image-2
OpenAI gpt-image-2 image generation and editing.
vhscli generate gpt-image-2 "a children's book drawing of a veterinarian examining a cat" -o vet.png
vhscli generate gpt-image-2 "replace the background with a starry night" -i photo.jpg -o night.png
Options:
- -o, --output <path> sets the output path.
- -i <path> adds a reference image. Repeat for more.
- --size <size> accepts a preset or custom WxH. Default is 1024x1024.
Size presets are 1024x1024, 1536x1024, 1024x1536, 2048x2048, 2048x1152, and 3840x2160.
Custom sizes must use multiples of 16, max edge 3840, total pixels from 655360 to 8294400, and aspect ratio from 1:3 to 3:1.
The -o extension selects the local format: png, jpg, jpeg, or webp.
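The custom size rules for gpt-image-2 are easy to check before submitting. The sketch below is one way to express them; `checkCustomSize` is a hypothetical name, not a function exported by the CLI, and treating both pixel bounds and both aspect ratio bounds as inclusive is an assumption (the 3840x2160 preset sits exactly on the upper pixel bound, which suggests that bound at least is inclusive).

```typescript
// Returns null when WxH satisfies the documented rules, otherwise a short reason.
function checkCustomSize(width: number, height: number): string | null {
  if (!Number.isInteger(width) || !Number.isInteger(height) || width <= 0 || height <= 0) {
    return "width and height must be positive integers";
  }
  if (width % 16 !== 0 || height % 16 !== 0) return "both edges must be multiples of 16";
  if (Math.max(width, height) > 3840) return "max edge is 3840";
  const pixels = width * height;
  if (pixels < 655360 || pixels > 8294400) return "total pixels must be 655360 to 8294400";
  // Aspect ratio from 1:3 to 3:1, compared with integers to avoid float rounding.
  if (width > 3 * height || height > 3 * width) return "aspect ratio must be 1:3 to 3:1";
  return null;
}

console.log(checkCustomSize(3840, 2160)); // null (the largest preset passes)
console.log(checkCustomSize(1000, 1000)); // rejected: not a multiple of 16
console.log(checkCustomSize(3840, 1024)); // rejected: wider than 3:1
```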
Video Model
seedance-2
Seedance 2.0 video generation.
vhscli generate seedance-2 "a woman in a red dress walks through a rainy neon-lit alley, slow tracking shot" -o alley.mp4
vhscli generate seedance-2 "animate this photo: gentle pan to the right" --first-frame photo.jpg -o pan.mp4
vhscli generate seedance-2 "match the camera move from this clip in a cyberpunk street" -v ref.mp4 -o cyber.mp4
Options:
- -o, --output <path> sets the output path.
- --first-frame <image> uses an image as the first frame.
- --last-frame <image> uses an image as the last frame. It requires --first-frame.
- -i <path> adds a reference image. Maximum 9. Conflicts with --first-frame.
- -v <path> adds a reference video. Maximum 3; repeat for more.
- -a <path> adds a reference audio file. Maximum 3. Requires at least one -i or -v.
- --ratio <ratio> accepts 16:9, 4:3, 1:1, 3:4, 9:16, or 21:9. Default is 16:9.
- --resolution <res> accepts 480p, 720p, or 1080p. Default is 720p.
- --duration <n> accepts 4 to 15. Default is 5.
- --audio / --no-audio toggles the audio track. Default is --audio (audio on); pass --no-audio for a silent video.
- --seed <n> sets a random seed.
The command polls until the result is ready and prints progress. To avoid blocking, use vhscli submit seedance-2 ... (same flags) to detach immediately, then vhscli resume <task_id> later. Inside a project, an interrupted generate can also be finished with vhscli resume <task_id> (the id it printed at submit).
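The seedance-2 input flags interact with each other, so it can help to see the documented constraints in one place. The sketch below is only an illustration: the `SeedanceInputs` shape and `checkSeedanceInputs` are hypothetical names, the error strings are invented, and the real CLI may report these conditions differently.

```typescript
// Hypothetical, already-parsed view of the seedance-2 input flags.
interface SeedanceInputs {
  firstFrame?: string;
  lastFrame?: string;
  images: string[]; // -i
  videos: string[]; // -v
  audios: string[]; // -a
  duration: number; // --duration
}

// Collects every violated constraint instead of stopping at the first one.
function checkSeedanceInputs(x: SeedanceInputs): string[] {
  const errors: string[] = [];
  if (x.lastFrame !== undefined && x.firstFrame === undefined) {
    errors.push("--last-frame requires --first-frame");
  }
  if (x.images.length > 0 && x.firstFrame !== undefined) {
    errors.push("-i conflicts with --first-frame");
  }
  if (x.images.length > 9) errors.push("at most 9 reference images");
  if (x.videos.length > 3) errors.push("at most 3 reference videos");
  if (x.audios.length > 3) errors.push("at most 3 reference audio files");
  if (x.audios.length > 0 && x.images.length === 0 && x.videos.length === 0) {
    errors.push("-a requires at least one -i or -v");
  }
  if (x.duration < 4 || x.duration > 15) errors.push("--duration must be 4 to 15");
  return errors;
}

// One violation: a reference image together with a first frame.
console.log(
  checkSeedanceInputs({ firstFrame: "photo.jpg", images: ["ref.jpg"], videos: [], audios: [], duration: 5 }),
);
```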
Files And Output
-o is required; it sets the output path, interpreted relative to your cwd. Inside a project the path is recorded in gen.db relative to the project root, so the record survives the tree being moved.
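The two path interpretations above (relative to the cwd on the command line, relative to the project root in gen.db) can be reconciled with a small conversion. This is a sketch under assumptions: `projectRelative` is a hypothetical name, and it uses `path.posix` so the example output is the same on every platform, whereas real code would more likely use the platform's own `path`.

```typescript
import * as path from "node:path";

// Resolves `output` against the cwd, then expresses it relative to the project root.
function projectRelative(projectDir: string, cwd: string, output: string): string {
  const absolute = path.posix.resolve(cwd, output);
  return path.posix.relative(projectDir, absolute);
}

console.log(projectRelative("/work/film", "/work/film/shots", "clip.mp4")); // shots/clip.mp4
```

Because the stored value does not mention the absolute location of the project, moving the whole tree leaves the record valid. An output outside the project would come back with a leading `..`; how the CLI treats that case is not stated here.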
Downloads are written through a temporary file and renamed into place. If the requested extension does not match the downloaded media type, conversion happens locally:
- images use sips on macOS, magick (ImageMagick) elsewhere
- videos use ffmpeg
When resume writes a file whose original directory no longer exists, it falls back to the project root; an existing name then gets a Chrome-style (N) suffix. The path actually written is recorded in gen.db (actual_path), so it can differ from the requested -o.
Input uploads use content hashes for remote object paths. A duplicate upload is treated as success, not failure.
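A content-addressed upload path can be sketched as below. The README does not document the hash algorithm or the object path layout used for uploads, so SHA-256 and the `<hex digest><ext>` layout are assumptions, and `objectPath` and `uploadSucceeded` are hypothetical helpers. The treatment of HTTP 409 as success follows the duplicate-key behaviour described under Design Decisions.

```typescript
import { createHash } from "node:crypto";

// Assumed layout: the hex SHA-256 of the bytes followed by the extension.
function objectPath(bytes: Uint8Array, ext: string): string {
  const digest = createHash("sha256").update(bytes).digest("hex");
  return `${digest}${ext}`;
}

// A 2xx response means the object was stored; a 409 means the same bytes are
// already there, which is an equally good outcome for a content-addressed path.
function uploadSucceeded(status: number): boolean {
  return (status >= 200 && status < 300) || status === 409;
}

const first = objectPath(Buffer.from("same bytes"), ".jpg");
const second = objectPath(Buffer.from("same bytes"), ".jpg");
console.log(first === second); // true
```

Identical inputs map to the same remote path, which is what makes a repeated upload safe to treat as a no-op.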
Image type is detected from file content. JPEG and PNG are uploaded as-is; other image formats are converted to JPEG first.
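Detecting the type from content usually means checking the file's leading signature bytes rather than its extension. The sketch below shows that idea for the two formats uploaded as-is; `sniffImageType` and `needsJpegConversion` are hypothetical names and are not taken from the CLI's own detect_mime. The JPEG and PNG signatures themselves are standard.

```typescript
// Looks only at the leading bytes; the file name is never consulted.
function sniffImageType(bytes: Uint8Array): "image/jpeg" | "image/png" | null {
  const startsWith = (sig: number[]): boolean => sig.every((b, i) => bytes[i] === b);
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  return null;
}

// Anything that is not JPEG or PNG would be converted to JPEG before upload.
function needsJpegConversion(bytes: Uint8Array): boolean {
  return sniffImageType(bytes) === null;
}

console.log(sniffImageType(new Uint8Array([0xff, 0xd8, 0xff, 0xe0]))); // image/jpeg
```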
Smoke Tests
smoke/ holds end-to-end tests that hit the live backend with a real
session (they cost money and take minutes — the seedance tests render
video). Each script generates an output, asserts the file is non-empty,
and asks chat to describe it. Run them all with:
npm run smoke # logs + summary under smoke/out/logs/<timestamp>/
Project Structure
src/
  main.ts # command tree, login/logout/whoami/models, top-level error handler
  models.ts # the model registry: flags, payload builders, endpoints, save_result
  version.ts
  cmd/
    chat.ts
    resume.ts
  lib/
    auth.ts # fresh_session: a refreshed access token per request
    backend.ts # one named function per vhs edge endpoint (rpc over http)
    db.ts # remote task2 helpers (insert_task, get_task), postgrest
    error.ts # Fail + fail() (thrown; main.ts is the only exit point)
    gendb.ts # local per-project task store ($VHS_PROJECTDIR/.vhs/gen.db)
    http.ts # kfetch (fetch + timeout, precise error messages)
    media.ts # save_media (streaming download + convert), detect_mime
    parse.ts # kparse (zod parse + readable diagnostic)
    process.ts
    project.ts # VHS_PROJECTDIR, path helpers, output resolve + stat/hash
    prompt.ts
    session.ts # interactive login (ensure_login, login)
    storage.ts # uploads to supabase storage (upload_images, upload_files)
    task.ts # submit/generate/resume orchestration + failure-tolerant polling
  schema/ # request/response shapes shared with the backend
Each generate model is one entry in src/models.ts: name, endpoint, kind,
flags, help text, and a payload builder. main.ts registers every entry
under both generate and submit with one shared run body, so adding a
model touches a single file. The four simple image models share a factory
and differ only in their size rules.
lib/backend.ts and lib/db.ts are the files that grow as we add
endpoints or columns. Both expose named, schema-validated helpers; callers
never reach raw fetch or PostgREST URLs.
Core Flow
Generation and chat share the same server handoff:
- Parse and validate CLI inputs.
- Upload local media to Supabase Storage via upload_images / upload_files.
- Build the provider payload and validate it against the model schema.
- Call task.create_and_submit(endpoint, payload, ...), which inserts a task2 row (id, user_id, endpoint, payload) and kicks off provider work via backend.submit2() (POST to /functions/v1/main2/submit2); the call returns immediately.
- Long-poll for the result via wait_for_task, which calls backend.poll2(): the server holds the request open until a realtime broadcast fires, then returns result and err inline. This is the single poll path for every endpoint; for the t3 video path (which has no provider callback) the server drives the upstream provider poller in the background and broadcasts on completion.
- Validate the result shape with kparse and save or print the output.
The CLI never calls model providers directly. The task2 row is the durable server-side job record; backend.submit2 is the only server entry point for model execution.
Inside a project, the CLI inserts the local gen.db row (keyed by the remote task id) before calling submit2, so an aborted run is always recoverable. vhscli submit stops right after submitting; vhscli resume <task_id> re-attaches via the same wait_for_task helper, reads the final task2 row through db.get_task, saves output through the model's result parser, and writes the path / mtime / size / hash back to gen.db — re-homing the file under the project root if its original directory is gone.
Design Decisions
The runtime is stock Node and TypeScript. There is no Bun dependency and no local workspace package assumption.
The package is ESM-only and targets modern Node. That keeps the runtime model simple and matches current npm behavior.
commander owns command parsing, help text, and option validation.
zod validates request payloads before they are written to task2, then validates service responses before output is saved. Request schemas use .optional() (3rd-party providers handle missing keys vs null differently, so we omit fields we don't set); every other schema (responses, db rows, backend RPC payloads, internal state) uses .nullable().default(null) so the parsed shape is fixed and the CLI only handles one empty case (null).
The CLI is intentionally thin: upload bytes, create a task row, call submit2, wait, and save the result. Provider-specific execution remains server-side.
The CLI is driven by AI agents (cue's nano agent, Claude Code, Codex). There are no automatic retries: a retry decision is the agent's to make, and it makes a better one when handed a precise error than the CLI could make blindly. So every failure exits 1 with a message that names the operation, the endpoint or file, the http status, and the response body — enough for an agent to distinguish "retry this" from "fix the input" from "give up".
What the CLI does own is correctness under restart:
- Every request fetches a fresh access token (auth.fresh_session), so a token expiring mid-run never kills a long generation.
- The gen.db row is inserted before submit2, so an aborted run is always resumable with vhscli resume <task_id>, including a wait that died from a network blip; the server-side job keeps running.
- Re-runs converge instead of erroring: task2 inserts and storage uploads are keyed by task id / content hash and treat duplicate-key 409 as success.
- Downloads stream to a unique temp file next to the destination and rename into place; partial downloads and failed conversions are cleaned up, never left as the output.
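The last point, writing through a temp file and renaming, can be sketched as follows. This is a simplified illustration and not the CLI's save_media: `writeAtomically` is a hypothetical name, the temp file naming is invented, and it writes an in-memory buffer synchronously where the real code is described as streaming the download.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";
import { randomUUID } from "node:crypto";

// Writes to a uniquely named temp file in the destination directory, then
// renames it over the destination. A reader never sees a half-written file.
function writeAtomically(dest: string, data: Uint8Array): void {
  const tmp = path.join(path.dirname(dest), `.${path.basename(dest)}.${randomUUID()}.tmp`);
  try {
    fs.writeFileSync(tmp, data);
    fs.renameSync(tmp, dest);
  } catch (err) {
    // Clean up the partial temp file; `force` ignores the case where it never existed.
    fs.rmSync(tmp, { force: true });
    throw err;
  }
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "atomic-"));
writeAtomically(path.join(dir, "out.bin"), new Uint8Array([1, 2, 3]));
console.log(fs.readdirSync(dir)); // only out.bin remains, no temp file
```

Keeping the temp file in the same directory as the destination matters because a rename is only atomic within one file system.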
Errors are not hidden. Expected failures throw Fail, which prints a short lowercase message and exits with status 1. Unexpected errors keep their stack — that is the bug report. main.ts is the single exit point (it sets process.exitCode rather than calling process.exit, so piped stdout always flushes), and library code never exits the process.
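The error policy above can be illustrated with a small sketch. The names `Fail` and `fail` match the ones mentioned for lib/error.ts, but the bodies here are guesses, and `httpFailure` and `runMain` are hypothetical helpers with an invented message format. It assumes a modern compile target, where subclassing `Error` keeps `instanceof` working.

```typescript
// Marker class for expected failures (bad input, http errors, missing files).
class Fail extends Error {
  constructor(message: string) {
    super(message);
    this.name = "Fail";
  }
}

// Library code throws; it never exits the process itself.
function fail(message: string): never {
  throw new Fail(message);
}

// Hypothetical message format naming the operation, endpoint, status, and body.
function httpFailure(op: string, endpoint: string, status: number, body: string): string {
  return `${op} failed: ${endpoint}: http ${status}: ${body}`;
}

// Single exit point: expected failures print their message and set the exit
// code; anything else is rethrown so its stack trace survives.
function runMain(body: () => void): void {
  try {
    body();
  } catch (err) {
    if (err instanceof Fail) {
      console.error(err.message);
      process.exitCode = 1; // lets pending stdout flush, unlike process.exit(1)
      return;
    }
    throw err;
  }
}
```

A caller would wrap the command body, for example `runMain(() => fail(httpFailure("submit", "/functions/v1/main2/submit2", 500, "upstream timeout")))`, which prints the message and leaves the process to exit with status 1 once its output is flushed.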
Auth state stays in ~/.vhs/session.json so this package can share existing VHS sessions.
