@getvhs/vhscli

v0.1.18

Published

8 hours ago

generate images and videos with ai

leizha0

vhscli

vhscli is a small CLI for sending image, video, and chat jobs to VHS.

The client handles auth, input uploads, task creation, polling, and writing output files. Model execution stays on the VHS service, so users do not need provider API keys on their machine.

Quick Start

Run without installing:

npx @getvhs/vhscli@latest login
npx @getvhs/vhscli@latest models
npx @getvhs/vhscli@latest generate seedream-5 "a corgi astronaut riding a bicycle on mars" -o corgi.jpg

The installed binary is vhscli:

vhscli login
vhscli models
vhscli generate gpt-image-2 "a clean app icon for a video tool" -o icon.png

Use - when the prompt should come from stdin:

cat prompt.txt | vhscli generate gpt-image-2 - -o art.png
cat question.txt | vhscli chat - -f paper.pdf

Requirements

Node.js 24 or newer (the task store uses the built-in node:sqlite)
sips on macOS or magick (ImageMagick) elsewhere, for local image conversion
ffmpeg, when the requested video output format differs from the source

Auth

vhscli login
vhscli whoami
vhscli logout

login opens the browser, listens for the OAuth callback locally, writes the session to ~/.vhs/session.json, and initializes the user's billing account on first login.

whoami prints the session email, falling back to the user id.

logout deletes the local session file.

Commands that need auth will start login if no valid session exists.

Projects and the task store

When the CLI runs inside a project — signalled by the VHS_PROJECTDIR environment variable (Cue sets it for every command) — generation tasks are tracked in a per-project SQLite database at:

$VHS_PROJECTDIR/.vhs/gen.db

Each row records the remote task id, the requested and actual output paths (relative to the project root), the output's mtime / size / sha256, and the prompt, model, reference inputs, and submitted payload. This replaces the old <output>.vhs_task sidecar files.

submit and resume require a project (they exit if VHS_PROJECTDIR is unset). submit creates gen.db if needed; resume fails if it is missing.
generate works with or without a project. With one it records the task (so an interrupted run can be finished later with resume <task_id>); without one it keeps the task in memory and writes the output directly.
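
The project rules above can be sketched as a small helper. This is only an illustration: genDbPath and requireProject are hypothetical names, not functions the CLI is known to export, and the error wording is made up.

```typescript
// Sketch only: function names and the error text are hypothetical.
type Env = Record<string, string | undefined>;

// Returns the gen.db path for the current project, or null outside a project.
function genDbPath(env: Env): string | null {
  const root = env["VHS_PROJECTDIR"];
  if (!root) return null;
  // Trim trailing slashes so the joined path has a single separator.
  const trimmed = root.replace(/\/+$/, "");
  return `${trimmed}/.vhs/gen.db`;
}

// submit and resume need a project; generate can fall back to in-memory tracking.
function requireProject(env: Env, command: string): string {
  const path = genDbPath(env);
  if (path === null) {
    throw new Error(`${command} requires a project: VHS_PROJECTDIR is not set`);
  }
  return path;
}
```

A generate-style caller would branch on genDbPath returning null, while submit and resume would call requireProject and let the error surface.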

vhscli chat does not touch the task store.

Commands

`vhscli models`

Prints the model aliases known to the CLI.

`vhscli generate`

generate is a command group. Use help to list model subcommands:

vhscli generate --help

Ask a model subcommand for its exact flags:

vhscli generate gpt-image-2 --help

generate submits the task, waits for it, and saves the output to -o (required). Inside a project it also records the task in gen.db, so an interrupted run can be finished with vhscli resume <task_id>.

`vhscli submit <model> [options]`

Same models and options as generate, but it submits the task, records it in gen.db, prints the task id (task_id: <uuid>), and exits without waiting. Requires a project.

vhscli submit seedance-2 "a robot dancing in tokyo at night" -o clip.mp4
# prints "task_id: <uuid>" and exits

vhscli resume <task_id>
# waits for the task to finish and writes clip.mp4

`vhscli resume <ids...>`

Finishes one or more submitted generations by their task id (the ids submit printed). For each id, resume waits if the task is still running, then writes the output to the path recorded at submit. It blocks until every task is done. Requires a project and an existing gen.db.

vhscli resume 7d3c1b2a-...
vhscli resume 7d3c1b2a-... 9a8b7c6d-...

If the original output directory was deleted, the file is written to the project root under its basename instead. A name conflict gets a Chrome-style (N) suffix, and the actual path is recorded back in gen.db.
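
A minimal sketch of the (N) suffix rule described above, assuming the suffix goes before the extension with a single space, as Chrome does for downloads. uniqueName and the exists callback are hypothetical; the real CLI may format or probe names differently.

```typescript
// Sketch only: uniqueName is a hypothetical helper, not the CLI's API.
// `exists` reports whether a candidate file name is already taken.
function uniqueName(name: string, exists: (candidate: string) => boolean): string {
  if (!exists(name)) return name;
  const dot = name.lastIndexOf(".");
  // A leading dot (".env") is treated as part of the stem, not an extension.
  const stem = dot > 0 ? name.slice(0, dot) : name;
  const ext = dot > 0 ? name.slice(dot) : "";
  for (let n = 1; ; n++) {
    const candidate = `${stem} (${n})${ext}`;
    if (!exists(candidate)) return candidate;
  }
}
```

Passing the existence check in as a callback keeps the naming rule separate from the filesystem, so it can be exercised without touching disk.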

The server job keeps running after the local process exits. resume attaches the CLI back to that task and writes the result when it is ready.

`vhscli chat <prompt>`

Runs one chat request and writes the answer to stdout. It does not create an output file.

vhscli chat "explain how to make sourdough in 5 steps"
vhscli chat "describe this image as json" -i photo.jpg
vhscli chat "summarize this paper in 5 bullets; include page numbers" -f paper.pdf
vhscli chat "list key events with HH:mm:ss timestamps" -v clip.mp4 --fps 2

Options:

-i <path> attaches an image. Repeat for multiple images.
-f <path> attaches a PDF. Repeat for multiple files.
-v <path> attaches one video.
--fps <n> samples video from 0.2 to 5 frames per second. Default is 1.

Image Models

`seedream-5`

Seedream 5.0 image generation and editing.

vhscli generate seedream-5 "a girl in a yellow raincoat walking under a parasol, monet oil painting style" -o girl.jpg
vhscli generate seedream-5 "remove her hat, keep everything else" -i photo.jpg -o edit.jpg

Options:

-o, --output <path> sets the output path.
-i <path> adds a reference image. Repeat the flag for more, up to 14.
--size <size> accepts 2K, 3K, or custom WxH. Default is 2K.

Custom sizes must pass the model pixel range and aspect ratio checks enforced by the CLI.

`seedream-4-5`

Seedream 4.5 image generation and editing.

vhscli generate seedream-4-5 "an open refrigerator with milk, eggs, leftover chicken, strawberries; warm light" -o fridge.jpg
vhscli generate seedream-4-5 "swap the dress to red, keep her pose unchanged" -i photo.jpg -o edit.jpg

Options:

-o, --output <path> sets the output path.
-i <path> adds a reference image. Repeat the flag for more, up to 14.
--size <size> accepts 2K, 4K, or custom WxH. Default is 2K.

`nano-banana-2`

Nano Banana 2 image generation and editing.

vhscli generate nano-banana-2 "remove the man from the photo, keep everything else" -i photo.jpg -o clean.png
vhscli generate nano-banana-2 "a glossy candle in a bell jar on a marble counter, soft light" -o candle.png

Options:

-o, --output <path> sets the output path.
-i <path> adds a reference image. Repeat the flag for more, up to 14.
--size <size> accepts 512, 1K, 2K, or 4K. Default is 1K.

Output aspect ratio is fixed to 1:1.

`nano-banana-pro`

Nano Banana Pro image generation and editing.

vhscli generate nano-banana-pro "a glossy face moisturizer jar on warm studio backdrop" -o jar.png
vhscli generate nano-banana-pro "a sun-drenched minimalist living room with a 3d armchair from this sketch" -i sketch.jpg -o room.png

Options:

-o, --output <path> sets the output path.
-i <path> adds a reference image. Repeat the flag for more, up to 14.
--size <size> accepts 1K, 2K, or 4K. Default is 1K.

Output aspect ratio is fixed to 1:1.

`gpt-image-2`

OpenAI gpt-image-2 image generation and editing.

vhscli generate gpt-image-2 "a children's book drawing of a veterinarian examining a cat" -o vet.png
vhscli generate gpt-image-2 "replace the background with a starry night" -i photo.jpg -o night.png

Options:

-o, --output <path> sets the output path.
-i <path> adds a reference image. Repeat for more.
--size <size> accepts a preset or custom WxH. Default is 1024x1024.

Size presets are 1024x1024, 1536x1024, 1024x1536, 2048x2048, 2048x1152, and 3840x2160.

Custom sizes must use multiples of 16, max edge 3840, total pixels from 655360 to 8294400, and aspect ratio from 1:3 to 3:1.
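
The custom-size rules above can be checked with a few integer comparisons. The sketch below is an assumption about how such a check might look, not the CLI's own validator; checkCustomSize and its messages are hypothetical.

```typescript
// Sketch only: checkCustomSize is hypothetical. Returns null when the size
// passes every rule stated above, otherwise a short reason.
function checkCustomSize(size: string): string | null {
  const m = /^(\d+)x(\d+)$/.exec(size);
  if (!m) return "size must look like WxH";
  const w = Number(m[1]);
  const h = Number(m[2]);
  if (w % 16 !== 0 || h % 16 !== 0) return "width and height must be multiples of 16";
  if (Math.max(w, h) > 3840) return "max edge is 3840";
  const pixels = w * h;
  if (pixels < 655360 || pixels > 8294400) return "total pixels must be 655360 to 8294400";
  // Aspect ratio between 1:3 and 3:1, checked without floating point.
  if (w > 3 * h || h > 3 * w) return "aspect ratio must be between 1:3 and 3:1";
  return null;
}
```

All six presets listed above pass this check; 3840x2160 sits exactly on the upper pixel bound (3840 × 2160 = 8294400).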

The -o extension selects the local format: png, jpg, jpeg, or webp.

Video Model

`seedance-2`

Seedance 2.0 video generation.

vhscli generate seedance-2 "a woman in a red dress walks through a rainy neon-lit alley, slow tracking shot" -o alley.mp4
vhscli generate seedance-2 "animate this photo: gentle pan to the right" --first-frame photo.jpg -o pan.mp4
vhscli generate seedance-2 "match the camera move from this clip in a cyberpunk street" -v ref.mp4 -o cyber.mp4

Options:

-o, --output <path> sets the output path.
--first-frame <image> uses an image as the first frame.
--last-frame <image> uses an image as the last frame. It requires --first-frame.
-i <path> adds a reference image. Maximum 9. Conflicts with --first-frame.
-v <path> adds a reference video. Repeat the flag for more, up to 3.
-a <path> adds a reference audio file. Maximum 3. Requires at least one -i or -v.
--ratio <ratio> accepts 16:9, 4:3, 1:1, 3:4, 9:16, or 21:9. Default is 16:9.
--resolution <res> accepts 480p, 720p, or 1080p. Default is 720p.
--duration <n> accepts 4 to 15. Default is 5.
--audio / --no-audio toggles the audio track. Default is --audio (audio on); pass --no-audio for a silent video.
--seed <n> sets a random seed.
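
The flag constraints in the list above interact, so it may help to see them as one check. This is a sketch under assumptions: the SeedanceFlags shape, the function name, and the message wording are all hypothetical, and the real CLI may enforce these rules elsewhere (for example in commander option handling).

```typescript
// Sketch only: the shape and names below are hypothetical.
interface SeedanceFlags {
  firstFrame?: string;
  lastFrame?: string;
  images: string[]; // -i
  videos: string[]; // -v
  audios: string[]; // -a
  duration: number; // --duration
}

// Collects every violated rule instead of stopping at the first one.
function checkSeedanceFlags(f: SeedanceFlags): string[] {
  const errors: string[] = [];
  if (f.lastFrame !== undefined && f.firstFrame === undefined) {
    errors.push("--last-frame requires --first-frame");
  }
  if (f.images.length > 0 && f.firstFrame !== undefined) {
    errors.push("-i conflicts with --first-frame");
  }
  if (f.images.length > 9) errors.push("at most 9 reference images");
  if (f.videos.length > 3) errors.push("at most 3 reference videos");
  if (f.audios.length > 3) errors.push("at most 3 reference audio files");
  if (f.audios.length > 0 && f.images.length === 0 && f.videos.length === 0) {
    errors.push("-a requires at least one -i or -v");
  }
  if (f.duration < 4 || f.duration > 15) errors.push("--duration must be 4 to 15");
  return errors;
}
```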

The command polls until the result is ready and prints progress. To avoid blocking, use vhscli submit seedance-2 ... (same flags) to detach immediately, then vhscli resume <task_id> later. Inside a project, an interrupted generate can also be finished with vhscli resume <task_id> (the id it printed at submit).

Files And Output

-o is required; it sets the output path, interpreted relative to your cwd. Inside a project the path is recorded in gen.db relative to the project root, so the record survives the tree being moved.

Downloads are written through a temporary file and renamed into place. If the requested extension does not match the downloaded media type, conversion happens locally:

images use sips on macOS, magick (ImageMagick) elsewhere
videos use ffmpeg
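
As a rough illustration of the conversion rule above, the choice of tool depends only on the media kind and the platform. pickConverter and needsConversion are hypothetical names; treating jpg and jpeg as the same format is an assumption, and "darwin" is the value Node reports in process.platform on macOS.

```typescript
// Sketch only: both helpers are hypothetical, not part of the CLI's API.
type MediaKind = "image" | "video";

// Picks the external tool named in the list above.
function pickConverter(kind: MediaKind, platform: string): string {
  if (kind === "video") return "ffmpeg";
  return platform === "darwin" ? "sips" : "magick";
}

// Compares extensions case-insensitively, with or without a leading dot.
// Assumption: "jpeg" and "jpg" count as the same format.
function needsConversion(requestedExt: string, downloadedExt: string): boolean {
  const norm = (e: string): string => {
    const x = e.toLowerCase().replace(/^\./, "");
    return x === "jpeg" ? "jpg" : x;
  };
  return norm(requestedExt) !== norm(downloadedExt);
}
```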

When resume writes a file whose original directory no longer exists, it falls back to the project root; an existing name then gets a Chrome-style (N) suffix. The path actually written is recorded in gen.db (actual_path), so it can differ from the requested -o.

Input uploads use content hashes for remote object paths. A duplicate upload is treated as success, not failure.

Image type is detected from file content. JPEG and PNG are uploaded as-is; other image formats are converted to JPEG first.
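
Detecting JPEG and PNG from content usually means comparing the leading bytes against the formats' well-known signatures. The sketch below assumes the input is already known to be an image; sniffImage and uploadPlan are hypothetical names, and the CLI's own detect_mime may cover more formats.

```typescript
// Sketch only: hypothetical helpers illustrating content-based detection.
function sniffImage(bytes: Uint8Array): "image/jpeg" | "image/png" | null {
  // A signature matches only if every byte is present and equal.
  const startsWith = (sig: number[]): boolean => sig.every((b, i) => bytes[i] === b);
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  return null;
}

// JPEG and PNG go up unchanged; any other image is converted to JPEG first.
function uploadPlan(bytes: Uint8Array): "as-is" | "convert-to-jpeg" {
  return sniffImage(bytes) === null ? "convert-to-jpeg" : "as-is";
}
```

Because the check reads bytes rather than the file name, a PNG saved as photo.jpg is still uploaded as a PNG.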

Smoke Tests

smoke/ holds end-to-end tests that hit the live backend with a real session (they cost money and take minutes — the seedance tests render video). Each script generates an output, asserts the file is non-empty, and asks chat to describe it. Run them all with:

npm run smoke    # logs + summary under smoke/out/logs/<timestamp>/

Project Structure

src/
  main.ts         # command tree, login/logout/whoami/models, top-level error handler
  models.ts       # the model registry: flags, payload builders, endpoints, save_result
  version.ts
  cmd/
    chat.ts
    resume.ts
  lib/
    auth.ts       # fresh_session — a refreshed access token per request
    backend.ts    # one named function per vhs edge endpoint (rpc over http)
    db.ts         # remote task2 helpers (insert_task, get_task) — postgrest
    error.ts      # Fail + fail() (thrown; main.ts is the only exit point)
    gendb.ts      # local per-project task store ($VHS_PROJECTDIR/.vhs/gen.db)
    http.ts       # kfetch (fetch + timeout, precise error messages)
    media.ts      # save_media (streaming download + convert), detect_mime
    parse.ts      # kparse (zod parse + readable diagnostic)
    process.ts
    project.ts    # VHS_PROJECTDIR, path helpers, output resolve + stat/hash
    prompt.ts
    session.ts    # interactive login (ensure_login, login)
    storage.ts    # uploads to supabase storage (upload_images, upload_files)
    task.ts       # submit/generate/resume orchestration + failure-tolerant polling
    schema/       # request/response shapes shared with the backend

Each generate model is one entry in src/models.ts: name, endpoint, kind, flags, help text, and a payload builder. main.ts registers every entry under both generate and submit with one shared run body, so adding a model touches a single file. The four simple image models share a factory and differ only in their size rules.

lib/backend.ts and lib/db.ts are the files that grow as we add endpoints or columns. Both expose named, schema-validated helpers; callers never reach raw fetch or PostgREST URLs.

Core Flow

Generation and chat share the same server handoff:

1. Parse and validate CLI inputs.
2. Upload local media to Supabase Storage via upload_images / upload_files.
3. Build the provider payload and validate it against the model schema.
4. Call task.create_and_submit(endpoint, payload, ...), which inserts a task2 row (id, user_id, endpoint, payload) and kicks off provider work via backend.submit2() (POST to /functions/v1/main2/submit2); the call returns immediately.
5. Long-poll for the result via wait_for_task, which calls backend.poll2(). The server holds the request open until a realtime broadcast fires, then returns result and err inline. This is the single poll path for every endpoint; for the t3 video path (which has no provider callback) the server drives the upstream provider poller in the background and broadcasts on completion.
6. Validate the result shape with kparse and save or print the output.
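
The long-poll step in the flow above can be pictured as a loop around a single held request. The sketch below is not the real wait_for_task: the PollReply shape, the waitForTask name, the maxRounds bound, and the idea that a reply may come back with both fields null (so the client asks again) are all assumptions made for illustration.

```typescript
// Sketch only: hypothetical reply shape. Both fields null is assumed to mean
// "the held request ended without a result, ask again".
interface PollReply {
  result: string | null;
  err: string | null;
}

// Calls `poll` until the server reports a result or an error.
async function waitForTask(
  poll: () => Promise<PollReply>,
  maxRounds: number,
): Promise<string> {
  for (let round = 0; round < maxRounds; round++) {
    const reply = await poll();
    if (reply.err !== null) throw new Error(`task failed: ${reply.err}`);
    if (reply.result !== null) return reply.result;
    // Neither field set: issue the next held request.
  }
  throw new Error(`task still pending after ${maxRounds} polls`);
}
```

Injecting poll as a function keeps the loop independent of HTTP, so submit-then-wait and resume can share it, which mirrors how the flow above reuses one poll path for every endpoint.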

The CLI never calls model providers directly. The task2 row is the durable server-side job record; backend.submit2 is the only server entry point for model execution.

Inside a project, the CLI inserts the local gen.db row (keyed by the remote task id) before calling submit2, so an aborted run is always recoverable. vhscli submit stops right after submitting; vhscli resume <task_id> re-attaches via the same wait_for_task helper, reads the final task2 row through db.get_task, saves output through the model's result parser, and writes the path / mtime / size / hash back to gen.db — re-homing the file under the project root if its original directory is gone.

Design Decisions

The runtime is stock Node and TypeScript. There is no Bun dependency and no local workspace package assumption.

The package is ESM-only and targets modern Node. That keeps the runtime model simple and matches current npm behavior.

commander owns command parsing, help text, and option validation.

zod validates request payloads before they are written to task2, then validates service responses before output is saved. Request schemas use .optional() (3rd-party providers handle missing keys vs null differently, so we omit fields we don't set); every other schema (responses, db rows, backend RPC payloads, internal state) uses .nullable().default(null) so the parsed shape is fixed and the CLI only handles one empty case (null).
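
The two conventions above can be illustrated without zod. This is a plain-TypeScript sketch of the resulting shapes, not the schemas themselves: omitUnset mimics what a request built from .optional() fields looks like on the wire, and fillNull mimics what .nullable().default(null) yields after parsing. Both names are hypothetical.

```typescript
// Sketch only: hypothetical helpers that mirror the two schema conventions.

// Request side: unset fields are dropped, so the key never reaches the provider.
function omitUnset(obj: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(obj)) {
    if (obj[key] !== undefined) out[key] = obj[key];
  }
  return out;
}

// Response side: every expected key is present; missing or null becomes null.
function fillNull<K extends string>(
  keys: readonly K[],
  raw: Record<string, unknown>,
): Record<K, unknown> {
  const out = {} as Record<K, unknown>;
  for (const key of keys) out[key] = raw[key] ?? null;
  return out;
}
```

With this split, code that reads a response only ever checks for null, while request payloads stay free of keys the caller never set.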

The CLI is intentionally thin: upload bytes, create a task row, call submit2, wait, and save the result. Provider-specific execution remains server-side.

The CLI is driven by AI agents (Cue's nano agent, Claude Code, Codex). There are no automatic retries: a retry decision is the agent's to make, and it makes a better one when handed a precise error than the CLI could make blindly. So every failure exits 1 with a message that names the operation, the endpoint or file, the http status, and the response body. That is enough for an agent to distinguish "retry this" from "fix the input" from "give up".
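
To make the error contract concrete, here is one possible way to assemble such a message. The field names and the exact layout are hypothetical; the only thing taken from the text above is which four pieces of information the message carries.

```typescript
// Sketch only: the shape and message layout are hypothetical.
interface HttpFailure {
  operation: string; // what the CLI was doing, e.g. "submit task"
  target: string;    // the endpoint or file involved
  status: number;    // http status returned by the service
  body: string;      // raw response body, passed through unchanged
}

// Builds a single-line message carrying all four pieces.
function describeFailure(f: HttpFailure): string {
  return `${f.operation} ${f.target}: http ${f.status}: ${f.body}`;
}
```

An agent reading such a line can tell a 5xx on submit apart from a 4xx caused by a bad input without any extra context.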

What the CLI does own is correctness under restart:

Every request fetches a fresh access token (auth.fresh_session), so a token expiring mid-run never kills a long generation.
The gen.db row is inserted before submit2, so an aborted run is always resumable with vhscli resume <task_id> — including a wait that died from a network blip; the server-side job keeps running.
Re-runs converge instead of erroring: task2 inserts and storage uploads are keyed by task id / content hash and treat duplicate-key 409 as success.
Downloads stream to a unique temp file next to the destination and rename into place; partial downloads and failed conversions are cleaned up, never left as the output.
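
The duplicate-key rule in the list above amounts to widening what counts as success for idempotent writes. A minimal sketch, assuming the write is classified from its http status alone; insertOutcome and hasConverged are hypothetical names.

```typescript
// Sketch only: hypothetical classification of an idempotent write.
type Outcome = "created" | "already-exists" | "failed";

// 2xx means the row or object was written now; 409 means an earlier run
// already wrote it under the same task id or content hash.
function insertOutcome(status: number): Outcome {
  if (status >= 200 && status < 300) return "created";
  if (status === 409) return "already-exists";
  return "failed";
}

// A re-run converges when the write either happened or had already happened.
function hasConverged(status: number): boolean {
  return insertOutcome(status) !== "failed";
}
```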

Errors are not hidden. Expected failures throw Fail, which prints a short lowercase message and exits with status 1. Unexpected errors keep their stack — that is the bug report. main.ts is the single exit point (it sets process.exitCode rather than calling process.exit, so piped stdout always flushes), and library code never exits the process.
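
The split between expected and unexpected failures might look like the sketch below. It borrows the Fail and fail names from the text, but the bodies are guesses, and runMain is hypothetical: it returns the exit code instead of assigning process.exitCode so the example stays self-contained.

```typescript
// Sketch only: the class and helper bodies are assumptions.
class Fail extends Error {}

// Library code throws; it never exits the process itself.
function fail(message: string): never {
  throw new Fail(message);
}

// Single exit point: maps any thrown value to an exit code and stderr text.
function runMain(body: () => void): { exitCode: number; stderr: string } {
  try {
    body();
    return { exitCode: 0, stderr: "" };
  } catch (err) {
    // Expected failure: short message only.
    if (err instanceof Fail) return { exitCode: 1, stderr: err.message };
    // Unexpected failure: keep the stack, since that is the bug report.
    const text = err instanceof Error ? err.stack ?? err.message : String(err);
    return { exitCode: 1, stderr: text };
  }
}
```

In a real entry point the returned code would be assigned to process.exitCode, which lets pending stdout writes finish before the process ends.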

Auth state stays in ~/.vhs/session.json so this package can share existing VHS sessions.
