@314owen/prompty

v1.0.4

Published

a day ago

Typed, validated prompting against Cloudflare Workers AI


Typed, validated prompting against the Cloudflare Workers AI REST API.

Input/output validation via Zod: LLM responses that fail the output schema are reprompted automatically
Retry loop: a failed schema parse appends the validation error to the conversation and retries
KV caching: pass any KVNamespace-compatible binding; cache keys are the SHA-256 hash of (model, system, input)
Personas: shared defaults (model, system prompt, retries) composed with per-prompt overrides
Minimal runtime dependencies: zod is the only one; HTTP and hashing use native fetch and Web Crypto

Setup

npm install prompty

Copy the environment template and fill in your Cloudflare credentials:

cp .env.example .env

| Variable | Where to find it |
|---|---|
| CLOUDFLARE_ACCOUNT_ID | Cloudflare dashboard → right sidebar |
| CLOUDFLARE_API_KEY | dash.cloudflare.com/profile/api-tokens → create token with Workers AI permission |

Quick start

import { definePrompt } from 'prompty';
import { z } from 'zod';

const config = {
  accountId: process.env.CLOUDFLARE_ACCOUNT_ID!,
  apiKey: process.env.CLOUDFLARE_API_KEY!,
};

const classify = definePrompt({
  system: 'Classify the sentiment of the input text.',
  input: z.object({ text: z.string() }),
  output: z.object({
    sentiment: z.enum(['positive', 'negative', 'neutral']),
    confidence: z.number().min(0).max(1),
  }),
}, config);

const result = await classify.run({ text: 'The product is surprisingly good!' });
// → { sentiment: 'positive', confidence: 0.95 }

Personas

Share defaults across multiple prompts:

import { createPersona } from 'prompty';

const analyst = createPersona({
  system: 'You are a financial analyst. Be precise and cite figures.',
  model: '@cf/meta/llama-3.3-70b-instruct-fp8-fast',
  maxRetries: 5,
}, config);

const summarise = analyst({
  system: 'Summarise the following earnings report.',
  input: z.object({ report: z.string() }),
  output: z.object({
    summary: z.string(),
    keyFigures: z.array(z.string()),
  }),
});

const result = await summarise.run({ report: '...' });

Task-level values override persona defaults, with one exception: system prompts are not replaced but concatenated as persona.system + '\n\n' + task.system.
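The merge rule can be illustrated with a minimal sketch. The `PromptDefaults` shape and the `composeOptions` helper are hypothetical names used only for this example; the library's internal types may differ.

```typescript
// Hypothetical option shape for illustration; not the library's real type.
interface PromptDefaults {
  system?: string;
  model?: string;
  maxRetries?: number;
}

// Merge persona defaults with task-level options.
function composeOptions(persona: PromptDefaults, task: PromptDefaults): PromptDefaults {
  // System prompts are joined rather than replaced; missing parts are skipped.
  const system = [persona.system, task.system]
    .filter((part): part is string => typeof part === 'string' && part.length > 0)
    .join('\n\n');

  // For every other field, the task-level value wins.
  const merged: PromptDefaults = { ...persona, ...task };
  if (system.length > 0) {
    merged.system = system;
  }
  return merged;
}

const merged = composeOptions(
  {
    system: 'You are a financial analyst.',
    model: '@cf/meta/llama-3.3-70b-instruct-fp8-fast',
    maxRetries: 5,
  },
  { system: 'Summarise the following earnings report.', maxRetries: 2 },
);

console.log(merged.maxRetries); // 2 (task override)
console.log(merged.model);      // persona default is kept
```

Here the task overrides `maxRetries`, inherits `model`, and ends up with both system prompts separated by a blank line.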

Caching

Pass any object matching the KVNamespace interface — a Cloudflare KV binding in Workers, or an in-memory store in tests:

const prompt = definePrompt({
  system: 'Answer questions.',
  input: z.object({ q: z.string() }),
  output: z.object({ answer: z.string() }),
  cache: {
    enabled: true,
    kv: env.MY_KV,        // Cloudflare KV binding
    ttlSeconds: 3600,     // optional; omit for no expiry
  },
}, config);

// Skip cache for a specific call:
await prompt.run({ q: '...' }, { skipCache: true });
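For tests, an in-memory store could look roughly like the sketch below. It assumes the library only needs string `get` and `put` with an optional `expirationTtl` in seconds, which is a subset of the real KVNamespace interface; `MinimalKV` and `MemoryKV` are hypothetical names, and a type cast may be needed if the library's types require the full interface.

```typescript
// Assumed subset of the KVNamespace surface: string get/put with optional TTL.
interface MinimalKV {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void>;
}

class MemoryKV implements MinimalKV {
  private entries = new Map<string, { value: string; expiresAt: number | null }>();
  private readonly now: () => number;

  // The clock is injectable so tests can simulate expiry without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt !== null && this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: behave like a cache miss
      return null;
    }
    return entry.value;
  }

  async put(key: string, value: string, options?: { expirationTtl?: number }): Promise<void> {
    const ttl = options?.expirationTtl;
    // expirationTtl is in seconds; the clock is in milliseconds.
    const expiresAt = ttl === undefined ? null : this.now() + ttl * 1000;
    this.entries.set(key, { value, expiresAt });
  }

  // Number of stored entries, handy for test assertions.
  get size(): number {
    return this.entries.size;
  }
}
```

An instance of this class could then be passed as `cache.kv` in place of `env.MY_KV`.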

Defaults

| Setting | Default | Description |
|---|---|---|
| model | @cf/meta/llama-3.1-8b-instruct | LLM model to use |
| maxRetries | 3 | Max retry attempts on validation failure |
| maxContextTokens | 8000 | Max cumulative message history size (characters) |
| maxRequestSizeBytes | 51200 | Max input size before API call (50KB) |
| requestTimeoutMs | 30000 | Request timeout in milliseconds |
| cache.enabled | false | Enable KV caching |

Safety Features

Message History Limits

To prevent unbounded conversation growth during retries, maxContextTokens (default 8000) caps the cumulative size of the message history. Despite the name, the Defaults table describes this limit in characters. If the history exceeds the limit during retries, the prompt throws a max_retries error with details of the last attempt.

const prompt = definePrompt({
  system: 'Answer questions.',
  input: z.object({ q: z.string() }),
  output: z.object({ answer: z.string() }),
  maxContextTokens: 12000,  // Increase for complex schemas
  maxRetries: 5,
}, config);
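The interaction between retries and the history cap can be sketched as follows. This is not the library's implementation: `runWithRetries`, `callModel`, `parse`, and `RetryError` are hypothetical stand-ins, the feedback message wording is invented, the cap is assumed to be character-based, and `maxRetries` is assumed to mean retries after the first attempt. A real model call would be asynchronous; it is synchronous here to keep the example short.

```typescript
type Message = { role: 'system' | 'user' | 'assistant'; content: string };
type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

// Hypothetical error class mirroring the `kind` field described in this README.
class RetryError extends Error {
  readonly kind = 'max_retries' as const;
}

// Total characters across all messages (assumed unit of the cap).
function historySize(messages: Message[]): number {
  return messages.reduce((sum, m) => sum + m.content.length, 0);
}

function runWithRetries<T>(
  messages: Message[],
  callModel: (history: Message[]) => string, // stand-in for the API call
  parse: (raw: string) => ParseResult<T>,    // stand-in for schema validation
  maxRetries = 3,
  maxContextChars = 8000,
): T {
  const history = [...messages];
  let lastError = 'no attempts made';

  // One initial attempt plus up to maxRetries retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (historySize(history) > maxContextChars) {
      throw new RetryError(`Context limit exceeded; last error: ${lastError}`);
    }
    const raw = callModel(history);
    const result = parse(raw);
    if (result.ok) return result.value;

    lastError = result.error;
    // Feed the failure back so the next attempt can correct itself.
    history.push({ role: 'assistant', content: raw });
    history.push({ role: 'user', content: `Your response was invalid: ${result.error}. Please try again.` });
  }
  throw new RetryError(`Max retries exceeded; last error: ${lastError}`);
}
```

Each failed attempt adds two messages to the history, which is why a low cap combined with a high retry count can hit the context limit before the retry budget runs out.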

Input Size Validation

Input is validated against maxRequestSizeBytes (default 50KB) before the API call to prevent oversized requests:

const prompt = definePrompt({
  maxRequestSizeBytes: 102400,  // 100KB limit (100 × 1024 bytes)
  // ...
}, config);
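A size check of this kind might be implemented along the lines below. The helper names are hypothetical, and whether the library measures the serialised input alone or the whole request body is an assumption; this sketch measures the UTF-8 byte length of the JSON-serialised input.

```typescript
// UTF-8 byte length of the JSON-serialised input (assumed measurement).
function inputSizeBytes(input: unknown): number {
  return new TextEncoder().encode(JSON.stringify(input)).length;
}

// Throws before any network call if the input is too large.
function assertWithinLimit(input: unknown, maxRequestSizeBytes = 51200): void {
  const size = inputSizeBytes(input);
  if (size > maxRequestSizeBytes) {
    throw new Error(`Input is ${size} bytes; limit is ${maxRequestSizeBytes}`);
  }
}

// Multi-byte characters count by bytes, not by string length.
console.log(inputSizeBytes({ text: 'é' })); // 13
```

Counting bytes rather than characters matters for non-ASCII input, where a string's `length` understates the request size.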

Request Timeout

API requests have a default timeout of 30 seconds. Customize it per prompt:

const prompt = definePrompt({
  requestTimeoutMs: 60000,  // 60 seconds
  // ...
}, config);

Schema Constraints in Prompts

Array size constraints are communicated to the LLM in the system prompt:

const prompt = definePrompt({
  system: 'Generate a list.',
  input: z.object({ topic: z.string() }),
  output: z.object({
    items: z.array(z.string()).min(3).max(10),
  }),
}, config);

// LLM sees in system prompt:
// "items": array of string (at least 3 items, at most 10 items)
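As a rough illustration of how such a hint could be produced, the sketch below renders a description from a small hand-written descriptor. The `ArrayField` type and `describeArrayField` function are hypothetical; the library derives this text from the Zod schema itself, and how it does so is not shown here.

```typescript
// Hypothetical mini-descriptor standing in for information read from a Zod schema.
interface ArrayField {
  name: string;
  itemType: string;
  minItems?: number;
  maxItems?: number;
}

// Render one field as a line suitable for inclusion in a system prompt.
function describeArrayField(field: ArrayField): string {
  const limits: string[] = [];
  if (field.minItems !== undefined) limits.push(`at least ${field.minItems} items`);
  if (field.maxItems !== undefined) limits.push(`at most ${field.maxItems} items`);
  const suffix = limits.length > 0 ? ` (${limits.join(', ')})` : '';
  return `"${field.name}": array of ${field.itemType}${suffix}`;
}

console.log(describeArrayField({ name: 'items', itemType: 'string', minItems: 3, maxItems: 10 }));
// "items": array of string (at least 3 items, at most 10 items)
```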

Cache Validation

Cached values are re-validated against the output schema before they are returned. An entry that no longer matches (for example, one written before a schema change) is rejected and a fresh API call is made:

// If the schema changes but cache still has old format, 
// the old data is rejected and a fresh API call is made
const prompt = definePrompt({
  cache: { enabled: true, kv: env.MY_KV },
  output: z.object({ updated_field: z.string() }),
}, config);
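The read path can be sketched as below, under the assumption that cached values are stored as JSON strings. `Validator` is a hand-written stand-in for a schema's parse step, and `readValidatedCache` is a hypothetical helper, not part of the library's API.

```typescript
// Stand-in for schema validation: returns the typed value, or undefined on mismatch.
type Validator<T> = (value: unknown) => T | undefined;

// Returns the cached value only if it parses and still matches the current schema.
function readValidatedCache<T>(raw: string | null, validate: Validator<T>): T | undefined {
  if (raw === null) return undefined; // cache miss
  try {
    return validate(JSON.parse(raw));
  } catch {
    return undefined; // corrupt entry: treat as a miss
  }
}

// Validator for the current schema: { updated_field: string }.
const validateUpdated: Validator<{ updated_field: string }> = (value) => {
  if (typeof value === 'object' && value !== null && 'updated_field' in value) {
    const field = (value as { updated_field: unknown }).updated_field;
    if (typeof field === 'string') return { updated_field: field };
  }
  return undefined;
};

// An entry written under the old schema is rejected; a current one is returned.
const stale = readValidatedCache('{"old_field":"x"}', validateUpdated);
const fresh = readValidatedCache('{"updated_field":"x"}', validateUpdated);
console.log(stale, fresh);
```

When the helper returns `undefined`, the caller would fall through to a normal API call and overwrite the stale entry with the new result.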

Cache Key Canonicalization

Cache keys are generated from canonicalized JSON, so semantically identical inputs with different key ordering hash to the same key:

// Both inputs below cache to the same key
await prompt.run({ topic: 'dogs', count: 5 });
await prompt.run({ count: 5, topic: 'dogs' });
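One common way to canonicalize JSON is to serialise objects with their keys sorted recursively. The sketch below shows that technique; `canonicalize` is a hypothetical helper, and the library's exact canonical form may differ. It assumes JSON-compatible input.

```typescript
// Serialise a JSON-compatible value with object keys sorted at every level,
// so that key order does not affect the output.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    // Array order is significant and is preserved.
    return `[${value.map((item) => canonicalize(item)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .filter((key) => obj[key] !== undefined) // mirror JSON.stringify, which drops undefined
      .map((key) => `${JSON.stringify(key)}:${canonicalize(obj[key])}`)
      .join(',');
    return `{${body}}`;
  }
  // Primitives; undefined inside arrays becomes null, as in JSON.stringify.
  return JSON.stringify(value) ?? 'null';
}

const a = canonicalize({ topic: 'dogs', count: 5 });
const b = canonicalize({ count: 5, topic: 'dogs' });
console.log(a);       // {"count":5,"topic":"dogs"}
console.log(a === b); // true
```

The canonical string for (model, system, input) could then be hashed with SHA-256, for example through Web Crypto's `crypto.subtle.digest`, to produce the cache key.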

Error Handling

All errors are instances of PromptyError with a kind field for discrimination:

import { PromptyError } from 'prompty';

try {
  await prompt.run(input);
} catch (err) {
  if (err instanceof PromptyError) {
    if (err.kind === 'validation') {
      // Input validation failed — check err.message for field details
      console.error('Bad input:', err.message);
    } else if (err.kind === 'api') {
      // Cloudflare Workers AI API error
      console.error('API error:', err.message);
    } else if (err.kind === 'max_retries') {
      // Exhausted retries (validation or context limit)
      console.error('Max retries exceeded:', err.message);
    }
  }
}

Input validation errors include field-level details:

Input validation failed: "user.email": Invalid email; "user.age": Expected number
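A message in this format can be built from validation issues that carry a `path` and a `message`, as Zod issues do. The `Issue` interface and `formatIssues` helper below are hypothetical and only sketch the formatting step.

```typescript
// Shape matching the `path` and `message` fields found on Zod issues.
interface Issue {
  path: (string | number)[];
  message: string;
}

// Join each issue as "dotted.path": message, separated by semicolons.
function formatIssues(issues: Issue[]): string {
  const details = issues
    .map((issue) => `"${issue.path.join('.')}": ${issue.message}`)
    .join('; ');
  return `Input validation failed: ${details}`;
}

console.log(formatIssues([
  { path: ['user', 'email'], message: 'Invalid email' },
  { path: ['user', 'age'], message: 'Expected number' },
]));
// Input validation failed: "user.email": Invalid email; "user.age": Expected number
```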

Running tests

# Unit tests (no credentials needed)
npm test

# Integration tests (requires .env with real credentials)
npm run test:integration

# Watch mode
npm run test:watch