@314owen/prompty
v1.0.4
prompty
Typed, validated prompting against the Cloudflare Workers AI REST API.
- Input/output validation via Zod: bad LLM responses are reprompted automatically
- Retry loop: failed schema parses append the error to the conversation and retry
- KV caching: pass any KVNamespace-compatible binding; cache keys are SHA-256 of (model, system, input)
- Personas: shared defaults (model, system prompt, retries) composed with per-prompt overrides
- Zero runtime deps: only zod; uses native fetch and Web Crypto
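
To make the caching bullet concrete, here is a minimal sketch of how a key could be derived from (model, system, input) with Web Crypto. The helper names canonicalize and cacheKey are hypothetical, and the exact serialization the library hashes is an assumption; only the SHA-256 digest and the key-order independence described under Cache Key Canonicalization come from this README.

```typescript
// Hypothetical helpers: the library's internal names and exact key format may differ.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted at every level, so key order cannot affect the output.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(',')}]`;
  const entries = Object.keys(value)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${canonicalize(value[k])}`);
  return `{${entries.join(',')}}`;
}

// SHA-256 over the canonical form of (model, system, input), hex-encoded.
async function cacheKey(model: string, system: string, input: Json): Promise<string> {
  const bytes = new TextEncoder().encode(canonicalize([model, system, input]));
  const digest = await crypto.subtle.digest('SHA-256', bytes);
  return Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, '0')).join('');
}
```

Under this sketch, two inputs that differ only in key order serialize to the same string and therefore produce the same 64-character hex key.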
Setup
npm install prompty

Copy the environment template and fill in your Cloudflare credentials:

cp .env.example .env

| Variable | Where to find it |
|---|---|
| CLOUDFLARE_ACCOUNT_ID | Cloudflare dashboard → right sidebar |
| CLOUDFLARE_API_KEY | dash.cloudflare.com/profile/api-tokens → create a token with the Workers AI permission |
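
Once both variables are set, a small loader can fail fast when one is missing instead of relying on non-null assertions. This is a sketch only: loadConfig and the PromptyConfig type name are hypothetical, while the accountId and apiKey field names follow the Quick start.

```typescript
// Hypothetical type name; the field names mirror the config object in the Quick start.
interface PromptyConfig {
  accountId: string;
  apiKey: string;
}

// Build the config from an environment-like object, throwing if a variable is unset or empty.
function loadConfig(env: Record<string, string | undefined>): PromptyConfig {
  const accountId = env['CLOUDFLARE_ACCOUNT_ID'];
  const apiKey = env['CLOUDFLARE_API_KEY'];
  if (!accountId || !apiKey) {
    throw new Error('Set CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_API_KEY before running');
  }
  return { accountId, apiKey };
}
```

In Node this could be called as loadConfig(process.env); taking the environment as a parameter keeps the helper easy to test.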
Quick start
import { definePrompt } from 'prompty';
import { z } from 'zod';

const config = {
  accountId: process.env.CLOUDFLARE_ACCOUNT_ID!,
  apiKey: process.env.CLOUDFLARE_API_KEY!,
};

const classify = definePrompt({
  system: 'Classify the sentiment of the input text.',
  input: z.object({ text: z.string() }),
  output: z.object({
    sentiment: z.enum(['positive', 'negative', 'neutral']),
    confidence: z.number().min(0).max(1),
  }),
}, config);

const result = await classify.run({ text: 'The product is surprisingly good!' });
// → { sentiment: 'positive', confidence: 0.95 }

Personas
Share defaults across multiple prompts:
import { createPersona } from 'prompty';

const analyst = createPersona({
  system: 'You are a financial analyst. Be precise and cite figures.',
  model: '@cf/meta/llama-3.3-70b-instruct-fp8-fast',
  maxRetries: 5,
}, config);

const summarise = analyst({
  system: 'Summarise the following earnings report.',
  input: z.object({ report: z.string() }),
  output: z.object({
    summary: z.string(),
    keyFigures: z.array(z.string()),
  }),
});

const result = await summarise.run({ report: '...' });

Task-level values override persona defaults, with one exception: system prompts are concatenated as persona.system + '\n\n' + task.system.
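
The merge rule above can be sketched as a plain function. This is an illustration, not the library's code: PromptOptions and mergePersona are hypothetical names, only the three persona fields shown in this README are modeled, and the handling of a missing system prompt on either side is an assumption.

```typescript
// Hypothetical option shape; only fields named in this README are modeled.
interface PromptOptions {
  system?: string;
  model?: string;
  maxRetries?: number;
}

function mergePersona(persona: PromptOptions, task: PromptOptions): PromptOptions {
  // Task values win over persona defaults for ordinary fields.
  const merged: PromptOptions = { ...persona, ...task };
  // System prompts are joined with a blank line instead of being overridden.
  // Assumption: if only one side defines a system prompt, it is used unchanged.
  const parts = [persona.system, task.system].filter((s): s is string => Boolean(s));
  if (parts.length > 0) merged.system = parts.join('\n\n');
  return merged;
}
```

With the analyst persona and the summarise task from the example, this sketch would keep the persona's model and maxRetries and send both system prompts, persona first.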
Caching
Pass any object matching the KVNamespace interface — a Cloudflare KV binding in Workers, or an in-memory store in tests:
const prompt = definePrompt({
  system: 'Answer questions.',
  input: z.object({ q: z.string() }),
  output: z.object({ answer: z.string() }),
  cache: {
    enabled: true,
    kv: env.MY_KV,    // Cloudflare KV binding
    ttlSeconds: 3600, // optional; omit for no expiry
  },
}, config);

// Skip cache for a specific call:
await prompt.run({ q: '...' }, { skipCache: true });

Defaults
| Setting | Default | Description |
|---|---|---|
| model | @cf/meta/llama-3.1-8b-instruct | LLM model to use |
| maxRetries | 3 | Max retry attempts on validation failure |
| maxContextTokens | 8000 | Max cumulative message history size (characters) |
| maxRequestSizeBytes | 51200 | Max input size before API call (50KB) |
| requestTimeoutMs | 30000 | Request timeout in milliseconds |
| cache.enabled | false | Enable KV caching |
Safety Features
Message History Limits
To prevent unbounded conversation growth during retries, maxContextTokens (default 8000, measured in characters according to the Defaults table) caps the cumulative size of the conversation history. If the history exceeds this limit during retries, the prompt throws a max_retries error with details of the last attempt.
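
The interaction between retries and the history cap can be sketched as follows. The function runWithRetries, its callModel and parse parameters, and the SketchError class are hypothetical stand-ins; whether the library counts the first attempt toward maxRetries, and exactly when it checks the cap, are assumptions.

```typescript
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

// Hypothetical error class standing in for the library's PromptyError.
class SketchError extends Error {
  readonly kind = 'max_retries' as const;
}

async function runWithRetries<T>(
  messages: Message[],
  callModel: (history: Message[]) => Promise<string>, // stand-in for the Workers AI request
  parse: (raw: string) => T, // throws when the reply fails validation
  maxRetries = 3,
  maxContextChars = 8000,
): Promise<T> {
  const history = [...messages];
  let lastError = 'no attempt made';
  // Assumption: one initial attempt plus up to maxRetries retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const size = history.reduce((n, m) => n + m.content.length, 0);
    if (size > maxContextChars) break; // history grew past the cap: stop retrying
    const raw = await callModel(history);
    try {
      return parse(raw);
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
      // Feed the failure back so the next attempt can correct itself.
      history.push({ role: 'assistant', content: raw });
      history.push({ role: 'user', content: `Invalid response: ${lastError}. Try again.` });
    }
  }
  throw new SketchError(`Gave up; last error: ${lastError}`);
}
```

Because every failed attempt adds two messages, a verbose schema error can reach the cap before maxRetries is exhausted, which is why both outcomes surface as the same error kind in this sketch.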
const prompt = definePrompt({
  system: 'Answer questions.',
  input: z.object({ q: z.string() }),
  output: z.object({ answer: z.string() }),
  maxContextTokens: 12000, // Increase for complex schemas
  maxRetries: 5,
}, config);

Input Size Validation
Input is validated against maxRequestSizeBytes (default 50KB) before the API call to prevent oversized requests:
const prompt = definePrompt({
  maxRequestSizeBytes: 100000, // 100KB limit
  // ...
}, config);

Request Timeout
API requests have a default timeout of 30 seconds. Customize it per prompt:
const prompt = definePrompt({
  requestTimeoutMs: 60000, // 60 seconds
  // ...
}, config);

Schema Constraints in Prompts
Array size constraints are communicated to the LLM in the system prompt:
const prompt = definePrompt({
  system: 'Generate a list.',
  input: z.object({ topic: z.string() }),
  output: z.object({
    items: z.array(z.string()).min(3).max(10),
  }),
}, config);

// LLM sees in system prompt:
// "items": array of string (at least 3 items, at most 10 items)

Cache Validation
Cached values are re-validated against the output schema before being returned, so stale entries (e.g., written before a schema change) are rejected rather than returned as-is:
// If the schema changes but the cache still holds the old format,
// the old data is rejected and a fresh API call is made
const prompt = definePrompt({
  cache: { enabled: true, kv: env.MY_KV },
  output: z.object({ updated_field: z.string() }),
}, config);

Cache Key Canonicalization
Cache keys are generated from canonicalized JSON, so semantically identical inputs with different key ordering hash to the same key:
// Both inputs below cache to the same key
await prompt.run({ topic: 'dogs', count: 5 });
await prompt.run({ count: 5, topic: 'dogs' });

Error Handling
All errors are instances of PromptyError with a kind field for discrimination:
import { PromptyError } from 'prompty';

try {
  await prompt.run(input);
} catch (err) {
  if (err instanceof PromptyError) {
    if (err.kind === 'validation') {
      // Input validation failed; check err.message for field details
      console.error('Bad input:', err.message);
    } else if (err.kind === 'api') {
      // Cloudflare Workers AI API error
      console.error('API error:', err.message);
    } else if (err.kind === 'max_retries') {
      // Exhausted retries (validation or context limit)
      console.error('Max retries exceeded:', err.message);
    }
  }
}

Input validation errors include field-level details:

Input validation failed: "user.email": Invalid email; "user.age": Expected number

Running tests
# Unit tests (no credentials needed)
npm test
# Integration tests (requires .env with real credentials)
npm run test:integration
# Watch mode
npm run test:watch
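
For tests of your own that exercise caching without Cloudflare credentials, the Caching section notes that an in-memory store can stand in for the KV binding. A minimal sketch follows; the class name MemoryKV is hypothetical, and which KVNamespace methods the library actually calls is an assumption (get, put with expirationTtl, and delete are modeled here).

```typescript
// Minimal in-memory stand-in for a KV binding, intended for tests only.
class MemoryKV {
  private readonly store = new Map<string, { value: string; expiresAt: number | null }>();

  async get(key: string): Promise<string | null> {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (entry.expiresAt !== null && Date.now() >= entry.expiresAt) {
      this.store.delete(key); // expired: behave as if the key was never written
      return null;
    }
    return entry.value;
  }

  async put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void> {
    const ttl = options?.expirationTtl; // seconds, as in Workers KV
    this.store.set(key, { value, expiresAt: ttl ? Date.now() + ttl * 1000 : null });
  }

  async delete(key: string): Promise<void> {
    this.store.delete(key);
  }
}
```

An instance could then be passed as cache.kv in place of env.MY_KV, provided the library's types accept this subset of the interface.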