slidesmith
v0.2.16
A guided PPT image generation tool: master → pagination → multi-round candidates → export → 4K upscale
Slidesmith
A guided PPT image generation tool. Connects "master exploration → content pagination → multi-round candidates → export → 4K upscale" into a unified browser interface. Zero Python dependencies, one command to start.
Architecture
- Next.js 14 (App Router) + React 18 + Tailwind + Zustand
- Backend app/api/* supports apimart.ai and OpenRouter for image generation; the API Key is used server-side only
- SSE streams concurrent task progress in real time; sessions are persisted via browser localStorage
- Each session gets its own public/generated/{sessionId}/ directory (master/series/upscale/refs/_cache/)
Installation & Usage
Option A · CLI tool (recommended)
npm install -g slidesmith # after publishing
# or local dev: cd web && npm install && npm link
slidesmith # first run: syncs source, installs deps, builds; fast on subsequent runs
slidesmith dev # developer mode (hot reload)
slidesmith build # build only, no server
slidesmith sync # force re-sync workspace (troubleshooting)
slidesmith workspace # print workspace path
slidesmith --port 4000
slidesmith --no-open
slidesmith --help
Uninstall: npm uninstall -g slidesmith (or npm unlink -g slidesmith if installed via link).
How it works (Option A · workspace mode)
The global slidesmith package only contains core metadata. Runtime dependencies, build artifacts, and session data live in the user's home directory — nothing is written into the system-level node_modules/:
~/.slidesmith/
├── workspace/ # mirrored source + node_modules + .next + public/generated
│ ├── app/ components/ lib/ store/ ...
│ ├── node_modules/ # includes devDependencies required for next build
│ ├── .next/ # build output
│ └── public/generated/{sessionId}/ # session data
└── version.txt # version of the last sync
On first launch it automatically:
- Mirrors the package source to ~/.slidesmith/workspace/
- Runs npm install --include=dev in the workspace (~30–60s)
- Builds and starts Next.js
Upgrading: after npm install -g slidesmith@latest, the next launch detects the version change and re-syncs automatically. Session data (public/generated/) and custom design files (public/designs/*.DESIGN.md) are never overwritten.
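The upgrade check can be pictured as a comparison between the installed package version and the contents of ~/.slidesmith/version.txt. The following is a minimal sketch under that assumption, not the CLI's actual code: `planSync` and `isPreserved` are hypothetical names, and the preserved paths are taken from the paragraph above.

```typescript
// Hypothetical sketch: decide whether the workspace needs a re-sync.
// `planSync` and `isPreserved` are illustrative names, not slidesmith's real API.

interface SyncPlan {
  resync: boolean;
  reason: string;
}

// recordedVersion is the content of version.txt, or null when the file is missing.
function planSync(installedVersion: string, recordedVersion: string | null): SyncPlan {
  if (recordedVersion === null) {
    return { resync: true, reason: "first launch" };
  }
  if (recordedVersion.trim() !== installedVersion.trim()) {
    return { resync: true, reason: "version changed" };
  }
  return { resync: false, reason: "up to date" };
}

// Paths (relative to the workspace root) that a re-sync should leave untouched:
// session data and custom design files.
function isPreserved(relativePath: string): boolean {
  return (
    /^public\/generated\//.test(relativePath) ||
    /^public\/designs\/[^/]+\.DESIGN\.md$/.test(relativePath)
  );
}
```

A sync routine built on this would call `planSync` once at startup and skip every path for which `isPreserved` returns true while mirroring the source.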
Option B · Classic dev mode
cd web
npm install
npm run dev # http://localhost:3000
Production build:
npm run build && npm start
API Key & Endpoint Configuration
Two options:
- Environment variable (.env.local in the workspace):
    IMAGEN_API_KEY=sk-...
    # Optional: defaults to apimart.ai + gpt-image-2
    # IMAGEN_BASE_URL=https://api.apimart.ai/v1
    # IMAGEN_MODEL=gpt-image-2
- In-browser config: on first load a banner prompts you to configure. Click the Model Settings indicator in the top bar to open the dialog. Choose a provider from the dropdown (apimart.ai or OpenRouter) or enter a custom Base URL and Model manually. Settings are written to .env.local and take effect immediately; no restart is needed.
Supported providers out of the box:
| Provider | Base URL | Models |
|---|---|---|
| apimart.ai (default) | https://api.apimart.ai/v1 | gpt-image-2 |
| OpenRouter | https://openrouter.ai/api/v1 | openai/gpt-5.4-image-2, google/gemini-3.1-flash-image-preview |
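To illustrate how the three variables and their defaults fit together, here is a minimal sketch of resolving the settings from an environment map. `resolveImagenConfig` is a hypothetical helper, and treating an empty string as "unset" is an assumption; the variable names and default values come from the configuration described above.

```typescript
// Hypothetical sketch: resolve image-generation settings from environment variables.
// `resolveImagenConfig` is an illustrative name, not part of slidesmith's documented API.

interface ImagenConfig {
  apiKey: string;
  baseUrl: string;
  model: string;
}

// Defaults stated in the README: apimart.ai + gpt-image-2.
const DEFAULT_BASE_URL = "https://api.apimart.ai/v1";
const DEFAULT_MODEL = "gpt-image-2";

function resolveImagenConfig(env: Record<string, string | undefined>): ImagenConfig {
  const apiKey = env["IMAGEN_API_KEY"];
  if (!apiKey) {
    throw new Error("IMAGEN_API_KEY is not set");
  }
  return {
    apiKey: apiKey,
    // Assumption: an empty value falls back to the default, same as a missing one.
    baseUrl: env["IMAGEN_BASE_URL"] || DEFAULT_BASE_URL,
    model: env["IMAGEN_MODEL"] || DEFAULT_MODEL,
  };
}
```

In a Node process this would be called as `resolveImagenConfig(process.env)`. Switching to OpenRouter then only requires setting IMAGEN_BASE_URL and IMAGEN_MODEL to the values in the table.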
Workflow
- Step 1 · Master: paste or load DESIGN.md, describe the PPT theme, upload a logo, choose the number of candidates (default 6) → generate in parallel → click to select one.
- Step 2 · Upload: paste or load slide_content.md, configure the slide regex (default ^## Slide (\d+)), preview the per-slide split in real time. Global reference images (affecting all slides) can be enabled/disabled here; hover a card to see the action hint.
- Step 3 · Generate: set rounds per slide (default 3) and generate candidates in parallel. Each column is one slide, each row is one candidate; click to select the best one. Before generating, configure a global hint or reference images to apply to all slides.
- Step 4 · Export 1K: confirm the selected image for each slide, click "Download ZIP" to get the 1K version.
- Step 5 · 4K Upscale: re-generate each selected 1K image at 4K using the same prompt. Side-by-side preview + download all as ZIP.
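The per-slide split in Step 2 can be sketched as a line scan that starts a new section whenever the slide regex matches. This is a minimal illustration, not the code in lib/content.ts: `splitSlides` is a hypothetical name, the section shape follows the {id, title, content} fields listed for /api/split, and dropping any text before the first heading is an assumption.

```typescript
// Hypothetical sketch: split markdown into per-slide sections using a heading regex.
// `splitSlides` is an illustrative name; the real splitter may behave differently.

interface Section {
  id: string;
  title: string;
  content: string;
}

function splitSlides(md: string, pattern: string = "^## Slide (\\d+)"): Section[] {
  const heading = new RegExp(pattern);
  const sections: Section[] = [];
  let current: Section | null = null;
  let body: string[] = [];

  const lines = md.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const match = heading.exec(line);
    if (match) {
      // Close the previous section before opening a new one.
      if (current) {
        current.content = body.join("\n").trim();
        sections.push(current);
      }
      current = {
        // Use the first capture group as the id, else fall back to the running index.
        id: match[1] !== undefined ? match[1] : String(sections.length + 1),
        title: line.replace(/^#+\s*/, ""),
        content: "",
      };
      body = [];
    } else if (current) {
      body.push(line);
    }
    // Assumption: lines before the first heading are ignored.
  }
  if (current) {
    current.content = body.join("\n").trim();
    sections.push(current);
  }
  return sections;
}
```

Because the pattern is passed as a string, a custom regex entered in the UI could be forwarded unchanged, for example `splitSlides(md, "^# Page (\\d+)")`.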
In the history sidebar, click any session row to switch to it. The ✎ / ✕ buttons at the end of each row rename or delete the session.
API Reference
| Route | Purpose |
|---|---|
| POST /api/session | Create directory for a sessionId |
| POST /api/upload | Multipart upload (logo / reference images) → public/generated/{sid}/refs/ |
| POST /api/split | {md, pattern} → {sections: [{id, title, content}]} |
| POST /api/generate | Submit a batch of subtasks by stage, returns batchId |
| GET /api/job/:id (SSE) | Subscribe to real-time batch progress |
| POST /api/export | ZIP the selected images |
| POST /api/export-pptx | Export directly as .pptx |
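For clients that read the GET /api/job/:id stream without the browser's EventSource, the SSE wire format can be decoded by hand. The sketch below parses complete events from a text buffer. The payload shape (`JobEvent` with `taskId` and `status`) is purely hypothetical, since the event schema is not documented here; only the `data:` framing follows the SSE format itself.

```typescript
// Hypothetical sketch: decode Server-Sent Events text into JSON payloads.
// JobEvent is an assumed shape; the real events from /api/job/:id may differ.

interface JobEvent {
  taskId: string;
  status: string;
}

function parseSseEvents(raw: string): JobEvent[] {
  const events: JobEvent[] = [];
  // Events are separated by a blank line.
  const blocks = raw.split(/\r?\n\r?\n/);
  for (let i = 0; i < blocks.length; i++) {
    const dataLines: string[] = [];
    const lines = blocks[i].split(/\r?\n/);
    for (let j = 0; j < lines.length; j++) {
      // Only "data:" fields are collected; comments (":" prefix) and other fields are skipped.
      const m = /^data: ?(.*)$/.exec(lines[j]);
      if (m) {
        dataLines.push(m[1]);
      }
    }
    if (dataLines.length > 0) {
      // Multiple data lines in one event are joined with newlines before parsing.
      events.push(JSON.parse(dataLines.join("\n")) as JobEvent);
    }
  }
  return events;
}
```

The function assumes the buffer ends on an event boundary; a streaming reader would hold back any trailing partial event until the next chunk arrives.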
Export File Naming
ZIP / PPTX files are named slidesmith_<resolution>_<sessionId>.zip (e.g. slidesmith_1k_xxx.zip, slidesmith_4k_xxx.pptx). Files from older versions with a pptgen_* prefix use the same format — only the prefix differs.
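The naming scheme can be expressed as a pair of small helpers. This is a sketch with hypothetical function names (`exportFileName`, `parseExportFileName`); it assumes the resolution part never contains an underscore and that legacy pptgen_* files use the same two extensions.

```typescript
// Hypothetical sketch of the export naming scheme: <prefix>_<resolution>_<sessionId>.<ext>
// Function names are illustrative, not slidesmith's real API.

type Resolution = "1k" | "4k";
type ExportFormat = "zip" | "pptx";

interface ExportFileInfo {
  prefix: string;
  resolution: string;
  sessionId: string;
  format: string;
}

function exportFileName(resolution: Resolution, sessionId: string, format: ExportFormat): string {
  return `slidesmith_${resolution}_${sessionId}.${format}`;
}

// Accepts both the current prefix and the legacy pptgen prefix.
function parseExportFileName(name: string): ExportFileInfo | null {
  const m = /^(slidesmith|pptgen)_([^_]+)_(.+)\.(zip|pptx)$/.exec(name);
  if (!m) {
    return null;
  }
  return { prefix: m[1], resolution: m[2], sessionId: m[3], format: m[4] };
}
```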
Session Data & Runtime Directory
- Each session lives in public/generated/{sessionId}/ (master/series/upscale/refs/_cache/). Not tracked by Git, not bundled into the npm package.
- With the global install, the runtime directory is ~/.slidesmith/workspace/public/generated/. With npm run dev it is web/public/generated/.
- The browser localStorage key is pptgen-session (kept as-is for backward compatibility with existing sessions).
Image Client
lib/imagen.ts supports two generation paths:
- apimart.ai (polling-based, /images/generations): 15s initial wait + 4s interval + 600s timeout, with error classification and retry backoff.
- OpenRouter (chat completions-based, /chat/completions): sends refs as image_url message parts, extracts the generated image from the response.
lib/assets.ts encodes local reference images (PNG / JPG only) as base64 data URIs, with SHA1-keyed disk cache to avoid re-compressing on every generation. lib/content.ts handles per-slide splitting of slide_content.md.
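To make the apimart.ai polling parameters concrete, the sketch below computes the moments at which a status check would run. It is an illustration only: `pollOffsets` is a hypothetical helper, and it assumes the 600s timeout is measured from task submission (so it includes the 15s initial wait) and that error retries and backoff are handled elsewhere.

```typescript
// Hypothetical sketch: the polling timeline implied by "15s initial wait + 4s interval + 600s timeout".
// `pollOffsets` is an illustrative name, not a function in lib/imagen.ts.

interface PollOptions {
  initialWaitMs: number;
  intervalMs: number;
  timeoutMs: number;
}

const APIMART_POLL: PollOptions = {
  initialWaitMs: 15000,
  intervalMs: 4000,
  timeoutMs: 600000,
};

// Returns elapsed-time offsets (ms since submission) at which a status check runs.
// Assumption: the timeout counts from submission, not from the first check.
function pollOffsets(opts: PollOptions): number[] {
  if (opts.intervalMs <= 0) {
    throw new Error("intervalMs must be positive");
  }
  const offsets: number[] = [];
  for (let t = opts.initialWaitMs; t <= opts.timeoutMs; t += opts.intervalMs) {
    offsets.push(t);
  }
  return offsets;
}
```

Under these assumptions a single task is checked at most 147 times before it is treated as timed out, with the first check at 15s and the last at 599s.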
