# @mcptoolshop/logo-studio

v1.0.0

Semantic product mining, encoder-aware prompt compilation, and diffusion logo generation. Zero dependencies.
## What is this?
Logo Studio takes your product name and description, figures out what makes it visually distinct, and generates professional brand marks using diffusion models.
Most AI logo tools are either template engines wearing an AI mask, or raw diffusion playgrounds that dump the prompt burden on you. Logo Studio is the missing middle: it understands your product, speaks the language that image encoders actually respond to, and handles the token budgeting so you don't have to.
The pipeline:

```
Product text → Mine visual anchors → Compile encoder-aware prompts → Generate images
               (LLM extraction)      (CLIP/T5 token budgeting)       (ComfyUI / Together / diffusers)
```

Zero production dependencies. The entire package — including a full CLIP BPE tokenizer — ships as pure TypeScript with no external npm packages. Install it and it works.
## Install

```sh
npm install @mcptoolshop/logo-studio
```

## Three ways to use it
### 1. Studio mode — let the AI handle everything
Feed it your product info. The mining engine extracts visual metaphors, an LLM writes a cohesive prompt, and the compiler token-budgets it for the image encoder.
```ts
import { mine, compile, generate } from '@mcptoolshop/logo-studio';

const mined = await mine({
  name: 'Acme Corp',
  description: 'We build bridges between legacy systems and modern cloud infrastructure.',
  industry: 'enterprise software',
});

// The LLM wrote a visual brief — the compiler uses it directly
const prompts = compile(mined, { promptMode: 'studio' });

const result = await generate(prompts[0], {
  backend: 'comfyui',
  seed: 42,
});
```

### 2. Raw mode — you write the prompt
Already know what you want? Write your own SDXL prompt. The pipeline adds negatives and handles the backend plumbing.
```ts
import { compile, generate } from '@mcptoolshop/logo-studio';

const prompts = compile(
  { productName: 'Acme', anchors: [], suggestedLogoType: 'symbol', suggestedStyle: '', suggestedPalette: [] },
  {
    promptMode: 'raw',
    rawPrompt: 'geometric shield emblem, interlocking AC monogram, deep navy and burnished gold, clean edges, black background',
  }
);

const result = await generate(prompts[0], { backend: 'comfyui' });
```

### 3. CLI — one command
```sh
# Full pipeline
logo-studio generate \
  --name "Acme Corp" \
  --description "We build bridges between legacy systems and modern cloud" \
  --backend comfyui \
  --variants 4 \
  --output ./logos/

# See what the compiler produces (no generation)
logo-studio dry-run --name "Acme" --description "We build bridges..."

# Inspect how CLIP tokenizes your text
logo-studio tokenize "geometric badge with mountain motif, navy and gold"
```

## How the pipeline works
### Mining
The mining engine sends your product text to an LLM (Anthropic or OpenAI) and asks it to think like a senior brand designer. It extracts:
- Visual anchors — physical objects that represent your product (a bridge, a prism, interlocking gears)
- A visual brief — a complete, cohesive SDXL prompt (40-65 words) written as a single visual story
- Style and palette suggestions — based on industry and product positioning
The key insight: instead of outputting fragmented 2-3 word phrases that get glued together with commas, the LLM writes the actual prompt as one coherent piece. This produces dramatically better results than any keyword-stuffing approach.
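For downstream use, the mined result needs a predictable shape. Here is a rough sketch of that structure, inferred from the raw-mode example earlier; the field names beyond those shown there (notably `visualBrief`) are assumptions, and the package's real type may carry more fields:

```typescript
// Hypothetical sketch of the mining output shape, inferred from the
// raw-mode compile() call. Not the package's actual exported type.
interface MinedProduct {
  productName: string;
  anchors: string[];          // physical objects: "bridge", "prism", ...
  suggestedLogoType: string;  // e.g. "symbol", "wordmark"
  suggestedStyle: string;
  suggestedPalette: string[]; // hex colors
  visualBrief?: string;       // the LLM's 40-65 word SDXL prompt (assumed field)
}

const mined: MinedProduct = {
  productName: 'Acme Corp',
  anchors: ['bridge', 'interlocking arcs'],
  suggestedLogoType: 'symbol',
  suggestedStyle: 'geometric minimal',
  suggestedPalette: ['#1B2A4A', '#C9A227'],
  visualBrief:
    'a stylized steel bridge arcing between two abstract towers, ' +
    'deep navy and burnished gold, clean geometry, black background',
};

console.log(mined.anchors.length); // number of visual anchors
```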
### Compilation
The compiler takes the mining output and prepares it for the image encoder:
- Token budgeting — CLIP has a hard cap of 77 tokens (75 usable). The compiler keeps Tier A under 72 to leave headroom.
- Visual brief validation — checks that the LLM's brief contains at least one Tier 1 physical object, stays within word count limits, and avoids banned words. Invalid briefs fall back to anchor assembly automatically.
- Banned word sanitization — strips words like "logo", "badge", "icon", "emblem" that pull from the cheap end of CLIP's training distribution.
- Checkpoint-aware negatives — FLUX, SDXL, and DreamShaper each have different failure modes. Pass a checkpoint name and the compiler adds the right suppressions.
- Negative prompt assembly — standard negatives for text suppression, photorealism prevention, and quality guards.
- Dual-tier output — Tier A (CLIP, ≤72 tokens) for subject and style, optional Tier B (T5, ≤250 tokens) for enrichment.
The compiler ships a full CLIP BPE tokenizer built from scratch in TypeScript. No Python, no WASM, no external calls. Token counts are exact, not estimated.
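The budgeting rule itself is simple to state. A minimal, self-contained sketch of the dual-tier check described above (illustrative only; the package's `checkBudget` export may behave differently):

```typescript
// Illustrative dual-tier budget check. CLIP allows 77 positions, of which
// 75 are usable; the compiler targets <=72 on Tier A to leave headroom.
// Tier B (T5) gets a 250-token cap.
type Tier = 'A' | 'B';

function fitsBudget(tokenCount: number, tier: Tier): boolean {
  const limit = tier === 'A' ? 72 : 250;
  return tokenCount <= limit;
}

console.log(fitsBudget(68, 'A'));  // true: under the CLIP headroom cap
console.log(fitsBudget(74, 'A'));  // false: would risk truncation
console.log(fitsBudget(180, 'B')); // true: T5 has more room
```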
### Generation
Three pluggable backends:
| Backend | What you need | Best for |
|---------|--------------|----------|
| ComfyUI | Running ComfyUI server (local or remote) | Full control, local GPU, SDXL/FLUX |
| Together AI | TOGETHER_API_KEY env var | Quick cloud generation, no GPU needed |
| diffusers | Python + NVIDIA GPU + diffusers installed | Research, fine-tuned models |
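The three backends above share one contract, which is what makes them pluggable. A hedged sketch of what that adapter contract could look like (the interface names and shapes here are assumptions; the package's actual `BackendCapabilities` type and `createBackend` signature may differ):

```typescript
// Illustrative adapter contract. Everything here is an assumption made
// for this sketch, not the package's real backend interface.
interface GenOptions { seed?: number; width?: number; height?: number; }
interface GenResult { images: Uint8Array[]; seed: number; }

interface LogoBackend {
  name: 'comfyui' | 'together' | 'diffusers';
  generate(prompt: string, opts: GenOptions): Promise<GenResult>;
}

// A stub backend showing the shape; a real adapter would call the
// ComfyUI HTTP API or Together's REST endpoint here.
const stub: LogoBackend = {
  name: 'comfyui',
  async generate(_prompt, opts) {
    return { images: [], seed: opts.seed ?? Math.floor(Math.random() * 1e9) };
  },
};

stub.generate('geometric shield emblem', { seed: 42 })
  .then(r => console.log(r.seed)); // 42
```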
### Governor (opt-in)
The governor is a token-based resource controller that prevents runaway costs:
- Budget tracking — caps total spend across mining + generation
- VRAM throttling — reads nvidia-smi, scales resolution down when memory is tight
- Warmup detection — waits for ComfyUI to load models before sending requests
- Circuit breaker — classifies failures (OOM, timeout, API error) and backs off automatically
- Cost estimation — preview what a run will cost before committing
```ts
import { createGovernor, mine, generate } from '@mcptoolshop/logo-studio';

const governor = createGovernor({ budget: 200, onBudgetExhausted: 'scale' });
const safeMine = governor.wrapMine(mine);
const safeGenerate = governor.wrapGenerate(generate);

// Same API — governor handles throttling transparently
const mined = await safeMine({ name: 'Acme', description: '...' });
```

## Environment variables
| Variable | Required | Default | Used by |
|----------|----------|---------|---------|
| ANTHROPIC_API_KEY | For mining | — | Mining engine (LLM extraction) |
| OPENAI_API_KEY | For mining (alternative) | — | Mining engine (LLM extraction) |
| TOGETHER_API_KEY | For Together backend | — | Together AI generation |
| COMFYUI_URL | No | http://127.0.0.1:8188 | ComfyUI backend |
You only need keys for the parts you use. Running ComfyUI locally with raw prompts needs zero API keys.
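One way to wire this up in your own code is to pick a backend from whichever variables are present. This is illustrative glue code, not part of the package; the `pickBackend` helper and its priority order are inventions for this sketch:

```typescript
// Hypothetical helper: choose a backend from available environment
// variables. The priority order is an assumption, not package behavior.
type BackendName = 'comfyui' | 'together' | 'diffusers';

function pickBackend(env: Record<string, string | undefined>): BackendName {
  if (env.COMFYUI_URL) return 'comfyui';       // explicit local server wins
  if (env.TOGETHER_API_KEY) return 'together'; // cloud fallback
  return 'comfyui'; // default URL http://127.0.0.1:8188 needs no key
}

console.log(pickBackend({ TOGETHER_API_KEY: 'tok_123' })); // "together"
console.log(pickBackend({}));                              // "comfyui"
```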
## ComfyUI setup
If you're using the ComfyUI backend (recommended for local GPU):
1. Install ComfyUI
2. Download an SDXL checkpoint (e.g., Juggernaut XL)
3. Start ComfyUI: `python main.py`
4. Logo Studio connects to `http://127.0.0.1:8188` by default
The SDXL workflow uses: CheckpointLoaderSimple → CLIPTextEncode (pos/neg) → EmptyLatentImage → KSampler (DPM++ 2M Karras, 30 steps, CFG 5.0) → VAEDecode → SaveImage.
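In ComfyUI's HTTP API, that node chain is expressed as a JSON graph keyed by node ID. A trimmed sketch of what the workflow above looks like in that format (node IDs, the checkpoint filename, and prompt placeholders are assumptions; the package's actual workflow template may differ):

```typescript
// Trimmed ComfyUI API graph for the SDXL workflow described above.
// [nodeId, outputIndex] pairs wire one node's output into another's input.
const workflow = {
  '1': { class_type: 'CheckpointLoaderSimple',
         inputs: { ckpt_name: 'juggernautXL.safetensors' } },
  '2': { class_type: 'CLIPTextEncode',
         inputs: { text: 'POSITIVE PROMPT', clip: ['1', 1] } },
  '3': { class_type: 'CLIPTextEncode',
         inputs: { text: 'NEGATIVE PROMPT', clip: ['1', 1] } },
  '4': { class_type: 'EmptyLatentImage',
         inputs: { width: 1024, height: 1024, batch_size: 1 } },
  '5': { class_type: 'KSampler',
         inputs: { model: ['1', 0], positive: ['2', 0], negative: ['3', 0],
                   latent_image: ['4', 0], seed: 42, steps: 30, cfg: 5.0,
                   sampler_name: 'dpmpp_2m', scheduler: 'karras', denoise: 1.0 } },
  '6': { class_type: 'VAEDecode',
         inputs: { samples: ['5', 0], vae: ['1', 2] } },
  '7': { class_type: 'SaveImage',
         inputs: { images: ['6', 0], filename_prefix: 'logo' } },
};

console.log(Object.keys(workflow).length); // 7 nodes
```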
## CLI reference
```
logo-studio generate   Generate logo images from a product description
logo-studio tokenize   Analyze CLIP token usage for a prompt
logo-studio dry-run    Show compiled prompts without generating
logo-studio sweep      Generate same prompt across multiple seeds for comparison
logo-studio validate   Check if a visual brief will produce good results
logo-studio backends   Inspect and manage generation backends
```
Generate options:

```
--name         Product name (required)
--description  Product description (or use --file)
--file         Read description from a text file
--backend      together | comfyui | diffusers (default: together)
--variants     How many images to generate (default: 4)
--style        Override style (minimal, geometric, organic, brutalist, etc.)
--colors       Override palette (comma-separated hex)
--output       Output directory (default: ./logos)
--svg          Also trace SVG vectors from generated PNGs
--seed         Fixed seed for reproducible output
--checkpoint   Checkpoint name (enables checkpoint-specific negative tuning)
--model        Together AI model ID or checkpoint override
--sampler      ComfyUI sampler (euler, dpmpp_2m, etc.)
--scheduler    ComfyUI scheduler (simple, karras, etc.)
--quantize     Diffusers quantization: fp8 | nf4 | none
--width        Image width in pixels (default: 1024)
--height       Image height in pixels (default: 1024)
--cfg          CFG scale override
--verbose      Show detailed prompt assembly
```

Sweep options:

```
--prompt   SDXL prompt to test (required)
--seeds    Comma-separated seed list (default: 42,123,777,2024)
--backend  together | comfyui | diffusers (default: together)
--output   Output directory (default: ./sweep)
```

Backends subcommands:

```
backends ping              Check if ComfyUI is reachable
backends list-checkpoints  List available checkpoints, samplers, schedulers
```

Governor options:

```
--governor        Enable resource control
--budget          Token budget for the session (default: 200)
--max-resolution  Max image size in pixels (default: 1024)
--max-spend       Together AI spend cap in USD (default: 5.00)
--on-budget       What to do when budget runs out: throw | warn | scale
```

## Programmatic API
Everything the CLI can do is available as typed functions:
```ts
// Mining
import { mine, getConceptCount } from '@mcptoolshop/logo-studio';

// Compilation
import { compile, validateBrief, getNegativesForCheckpoint } from '@mcptoolshop/logo-studio';
import type { CompileOptions, BriefValidation } from '@mcptoolshop/logo-studio';

// Generation
import { generate, createBackend } from '@mcptoolshop/logo-studio';
import type { BackendCapabilities } from '@mcptoolshop/logo-studio';

// Backend utilities
import { withRetry, isTransientError, ComfyUIError, ping, queryServerInfo } from '@mcptoolshop/logo-studio';

// Tokenizer (standalone)
import { getTokenizer, countTokens, checkBudget, detectCollisions } from '@mcptoolshop/logo-studio';

// Governor
import { createGovernor, Governor } from '@mcptoolshop/logo-studio';

// Post-processing
import { trace, verifyPalette } from '@mcptoolshop/logo-studio';
```

## Architecture
```
src/
├── mining/     LLM-driven semantic extraction + metaphor mapping
├── compiler/   Token-budgeted prompt assembly + canonical phrases
├── tokenizer/  Pure TypeScript CLIP BPE tokenizer + T5 estimator
├── backends/   ComfyUI, Together AI, diffusers adapters
├── governor/   Budget control, VRAM throttling, circuit breaker
├── post/       SVG tracing + palette verification
├── cli/        Zero-dependency CLI with argument parser
└── types/      Shared interfaces
```

30 source files. Zero runtime dependencies. One npm install.
## Why zero dependencies?
Every dependency is a supply chain risk, a version conflict waiting to happen, and a reason someone's npm install breaks on a Tuesday. Logo Studio ships everything it needs:
- CLIP tokenizer — full BPE implementation with the official merges table
- T5 estimator — statistical token counting without the 2GB model
- Argument parser — hand-rolled, no commander/yargs/minimist
- HTTP clients — native `fetch` (Node 18+)
- GPU metrics — `nvidia-smi` subprocess, graceful fallback on non-NVIDIA hardware
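The T5 estimator trades exactness for size. A crude, self-contained illustration of the statistical idea (the per-word ratio and punctuation weight are invented for this sketch; the package's estimator is its own, presumably calibrated, implementation):

```typescript
// Toy statistical estimator: approximate a T5 token count from word and
// punctuation counts. The 1.3 tokens-per-word ratio and 0.5 punctuation
// weight are assumptions made for this sketch only.
function estimateT5Tokens(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  const punct = (text.match(/[,.;:()]/g) ?? []).length;
  return Math.ceil(words * 1.3 + punct * 0.5);
}

const est = estimateT5Tokens('geometric badge with mountain motif, navy and gold');
console.log(est > 0 && est < 20); // a short prompt stays well under the 250 cap
```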
## Prompt engineering handbook
The repo includes a prompt engineering handbook documenting:
- How SDXL's dual CLIP encoder system works
- The CLIP visual strength hierarchy (physical objects → abstract concepts)
- Token budget rules and what eats tokens unexpectedly
- Logo prompt formulas with worked examples
- Negative prompt strategy for logos specifically
- CFG scale and sampler settings for different styles
- Known limitations (text rendering, color accuracy, composition)
It's the reference document behind every decision the compiler makes.
## Security & Data Scope

| Aspect | Detail |
|--------|--------|
| Data touched | Product descriptions (input), generated images (output), tokenizer data (bundled) |
| Data NOT touched | No user credentials, no databases, no analytics |
| Permissions | Read: input text/config. Write: output images to user-specified paths |
| Network | Optional — ComfyUI or Together API when configured. None by default. |
| Telemetry | None collected or sent |
See SECURITY.md for vulnerability reporting.
## Scorecard

| Category | Score |
|----------|-------|
| A. Security | 10 |
| B. Error Handling | 10 |
| C. Operator Docs | 10 |
| D. Shipping Hygiene | 10 |
| E. Identity (soft) | 10 |
| Overall | 50/50 |
Full audit: SHIP_GATE.md · SCORECARD.md
## License
MIT
