vague-lang
v3.3.0
A declarative language for generating realistic test data

Vague
A declarative language for describing and generating realistic data. Vague treats ambiguity as a first-class primitive — declare the shape of valid data and let the runtime figure out how to populate it.
Why Vague?
Vague is a data description model for APIs, not just a fake data tool.
Think of it as OpenAPI meets property-based testing: you describe what valid data looks like — its structure, constraints, distributions, and edge cases — and Vague handles generation. The same schema that generates test data can validate production data.
| What You Need | Traditional Tools | Vague |
|---------------|-------------------|-------|
| Intent — "80% of users are active" | Random selection | status: 0.8: "active" \| 0.2: "inactive" |
| Constraints — "due date ≥ issued date" | Manual validation | assume due_date >= issued_date |
| Relationships — "payment references an invoice" | Manual wiring | invoice: any of invoices where .status == "open" |
| Edge cases — "test with Unicode exploits" | Manual creation | name: issuer.homoglyph("admin") |
| Validation — "does this data match the schema?" | Separate tool | Same .vague file with --validate-data |
The question isn't "which fake data library?" — it's "how do we formally describe what valid data looks like for our APIs?"
For a detailed comparison, see COMPARISON.md.
Installation
npm install vague-lang
Or install globally for CLI usage:
npm install -g vague-lang
Quick Start
Create a .vague file:
schema Customer {
  name: string,
  status: 0.8: "active" | 0.2: "inactive"
}
schema Invoice {
  customer: any of customers,
  amount: decimal in 100..10000,
  status: "draft" | "sent" | "paid",
  assume amount > 0
}
dataset TestData {
  customers: 50 of Customer,
  invoices: 200 of Invoice
}
Generate JSON:
node dist/cli.js your-file.vague
Syntax Cheat Sheet
For a quick reference of all syntax, see SYNTAX.md.
Language Features
Superposition (Random Choice)
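One way to read a superposition is as a small probability table. The TypeScript sketch below is an illustration only, not Vague's implementation: `WeightedOption` and `resolveWeights` are hypothetical names. It assumes the mixed rule described in this section, where options without an explicit weight split the remaining probability evenly.

```typescript
// Hypothetical helper, not part of Vague's API.
interface WeightedOption {
  value: string;
  weight?: number; // omitted for unweighted options
}

function resolveWeights(options: WeightedOption[]): Map<string, number> {
  // Sum of the weights that were written explicitly.
  const explicitTotal = options.reduce((sum, o) => sum + (o.weight ?? 0), 0);
  const unweightedCount = options.filter((o) => o.weight === undefined).length;
  // Unweighted options share whatever probability is left over.
  const share = unweightedCount > 0 ? (1 - explicitTotal) / unweightedCount : 0;
  const resolved = new Map<string, number>();
  for (const o of options) {
    resolved.set(o.value, o.weight ?? share);
  }
  return resolved;
}

// category: 0.6: "main" | "side" | "dessert"
const categoryWeights = resolveWeights([
  { value: "main", weight: 0.6 },
  { value: "side" },
  { value: "dessert" },
]);
console.log(categoryWeights); // main keeps 0.6; side and dessert get about 0.2 each
```

With no explicit weights at all, the same function gives every option an equal share, which matches the equal-probability form.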
// Equal probability
status: "draft" | "sent" | "paid"
// Weighted probability
status: 0.6: "paid" | 0.3: "pending" | 0.1: "draft"
// Mixed: unweighted options share remaining probability
status: 0.85: "Active" | "Archived" // "Archived" gets 15%
category: 0.6: "main" | "side" | "dessert" // "side" and "dessert" get 20% each
Ranges
age: int in 18..65
price: decimal in 0.01..999.99
founded: date in 2000..2023
// Decimal with explicit precision
score: decimal(1) in 0..10 // 1 decimal place
amount: decimal(2) in 10..100 // 2 decimal places
Collections
line_items: 1..5 of LineItem // 1-5 items
employees: 100 of Employee // Exactly 100
Constraints
schema Invoice {
  issued_date: int in 1..28,
  due_date: int in 1..90,
  status: "draft" | "paid",
  amount: int in 0..10000,
  // Hard constraint
  assume due_date >= issued_date,
  // Conditional constraint
  assume if status == "paid" {
    amount == 0
  }
}
Logical operators: and, or, not
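Because the same rules can also be used to check data, it can help to read each `assume` as a boolean predicate over a record. The sketch below hand-translates the two constraints above into TypeScript. `InvoiceRecord` and `satisfiesInvoiceAssumes` are hypothetical names, and treating the conditional form as a logical implication is an assumption about its semantics, not something taken from Vague's source.

```typescript
// Hypothetical record type mirroring the Invoice schema above.
interface InvoiceRecord {
  issued_date: number;
  due_date: number;
  status: "draft" | "paid";
  amount: number;
}

function satisfiesInvoiceAssumes(inv: InvoiceRecord): boolean {
  // assume due_date >= issued_date
  const hard = inv.due_date >= inv.issued_date;
  // assume if status == "paid" { amount == 0 }
  // Read as an implication: it only constrains records whose status is "paid".
  const conditional = inv.status !== "paid" || inv.amount === 0;
  return hard && conditional;
}

console.log(satisfiesInvoiceAssumes({ issued_date: 5, due_date: 30, status: "draft", amount: 250 })); // true
console.log(satisfiesInvoiceAssumes({ issued_date: 5, due_date: 30, status: "paid", amount: 250 })); // false
```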
Cross-Record References
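A filtered reference can be thought of as "filter the collection, then pick one candidate at random". The TypeScript sketch below illustrates that reading. `anyOfWhere` and `CustomerRecord` are hypothetical names, and returning `undefined` when nothing matches is a choice made for this sketch; how Vague handles an empty candidate set is not covered here.

```typescript
interface CustomerRecord {
  name: string;
  status: "active" | "inactive";
}

// `rand` must return a number in [0, 1); it is injectable so picks can be made deterministic.
function anyOfWhere<T>(
  pool: T[],
  predicate: (item: T) => boolean,
  rand: () => number = Math.random,
): T | undefined {
  const candidates = pool.filter(predicate);
  if (candidates.length === 0) return undefined; // nothing to reference
  return candidates[Math.floor(rand() * candidates.length)];
}

const customerPool: CustomerRecord[] = [
  { name: "Ada", status: "inactive" },
  { name: "Grace", status: "active" },
  { name: "Linus", status: "active" },
];

// Roughly: any of customers where .status == "active"
const activeCustomer = anyOfWhere(customerPool, (c) => c.status === "active");
console.log(activeCustomer?.status); // "active"
```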
schema Invoice {
  // Reference any customer from the collection
  customer: any of customers,
  // Filtered reference
  active_customer: any of customers where .status == "active"
}
Parent References
schema LineItem {
  // Inherit currency from parent invoice
  currency: ^base_currency
}
schema Invoice {
  base_currency: "USD" | "GBP" | "EUR",
  line_items: 1..5 of LineItem
}
Computed Fields
schema Invoice {
  line_items: 1..10 of LineItem,
  total: sum(line_items.amount),
  item_count: count(line_items),
  avg_price: avg(line_items.unit_price),
  min_price: min(line_items.unit_price),
  max_price: max(line_items.unit_price),
  median_price: median(line_items.unit_price),
  first_item: first(line_items.unit_price),
  last_item: last(line_items.unit_price),
  price_product: product(line_items.unit_price)
}
Nullable Fields
nickname: string? // Shorthand: sometimes null
notes: string | null // Explicit
Ternary Expressions
status: amount_paid >= total ? "paid" : "pending"
grade: score >= 90 ? "A" : score >= 70 ? "B" : "C"
Match Expressions
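A `match` behaves much like a lookup over its arms. The TypeScript sketch below mirrors the example in this section, including the null result when no pattern matches. `displayArms` and `displayFor` are hypothetical names for illustration, not Vague APIs.

```typescript
// Each Map entry plays the role of one `pattern => result` arm.
const displayArms = new Map<string, string>([
  ["pending", "Awaiting shipment"],
  ["shipped", "On the way"],
  ["delivered", "Complete"],
]);

function displayFor(orderStatus: string): string | null {
  // An unmatched value falls through to null instead of throwing.
  return displayArms.get(orderStatus) ?? null;
}

console.log(displayFor("shipped")); // "On the way"
console.log(displayFor("cancelled")); // null
```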
// Pattern matching for multi-way branching
display: match status {
  "pending" => "Awaiting shipment",
  "shipped" => "On the way",
  "delivered" => "Complete"
}
// Returns null if no pattern matches
Conditional Fields
schema Account {
  type: "personal" | "business",
  companyNumber: string when type == "business" // Only exists for business accounts
}
Dynamic Cardinality
schema Order {
  size: "small" | "large",
  items: (size == "large" ? 5..10 : 1..3) of LineItem
}
Side Effects (then blocks)
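The `then` block in this section updates the invoice that a payment refers to. As a rough TypeScript equivalent, the sketch below assumes the block runs once per generated payment and mutates the referenced record in place. `OpenInvoice`, `PaymentRecord`, and `applyPayment` are hypothetical names.

```typescript
interface OpenInvoice {
  total: number;
  amount_paid: number;
  status: string;
}

interface PaymentRecord {
  invoice: OpenInvoice; // a reference, not a copy
  amount: number;
}

// Mirrors the two statements of the `then` block.
function applyPayment(payment: PaymentRecord): void {
  const inv = payment.invoice;
  inv.amount_paid += payment.amount;
  inv.status = inv.amount_paid >= inv.total ? "paid" : "partial";
}

const sampleInvoice: OpenInvoice = { total: 300, amount_paid: 0, status: "sent" };
applyPayment({ invoice: sampleInvoice, amount: 200 });
console.log(sampleInvoice.status); // "partial"
applyPayment({ invoice: sampleInvoice, amount: 100 });
console.log(sampleInvoice.status); // "paid"
```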
schema Payment {
  invoice: any of invoices,
  amount: int in 10..500
} then {
  invoice.amount_paid += amount,
  invoice.status = invoice.amount_paid >= invoice.total ? "paid" : "partial"
}
Unique Values
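A `unique` field has to avoid values already used elsewhere in the collection, so the range must hold at least as many values as there are records. One simple strategy is to retry on duplicates, as in the TypeScript sketch below. Whether Vague works this way internally is not stated here, and `uniqueInts` is a hypothetical name.

```typescript
// Draws `count` distinct integers from [min, max] by retrying on duplicates.
function uniqueInts(
  count: number,
  min: number,
  max: number,
  rand: () => number = Math.random,
): number[] {
  const span = max - min + 1;
  if (count > span) {
    throw new Error(`cannot draw ${count} unique values from ${span} candidates`);
  }
  const seen = new Set<number>();
  while (seen.size < count) {
    seen.add(min + Math.floor(rand() * span)); // duplicates are ignored by the Set
  }
  return Array.from(seen);
}

// Roughly: id: unique int in 1000..9999, for a collection of 50 records
const ids = uniqueInts(50, 1000, 9999);
console.log(new Set(ids).size); // 50
```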
id: unique int in 1000..9999 // No duplicates in collection
Private Fields
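A private field is available while a record is being computed but is left out of the emitted data. The TypeScript sketch below shows the same idea for the example in this section. `PersonOutput` and `generatePerson` are hypothetical names, and the age is passed in directly here instead of being generated.

```typescript
interface PersonOutput {
  age_bracket: "minor" | "adult";
}

function generatePerson(age: number): PersonOutput {
  // `age` drives the computation but is never copied into the result.
  return { age_bracket: age < 18 ? "minor" : "adult" };
}

console.log(JSON.stringify(generatePerson(42))); // {"age_bracket":"adult"}
```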
schema Person {
  age: private int in 0..105, // Generated but excluded from output
  age_bracket: age < 18 ? "minor" : "adult" // Computed from private field
}
// Output: { "age_bracket": "adult" } -- no "age" field
Ordered Sequences
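An ordered sequence hands out its values in order and wraps around at the end. In TypeScript terms this is an index taken modulo the list length. `cycleValue` is a hypothetical helper for illustration, and using the record's zero-based position as the index is an assumption of this sketch.

```typescript
// Returns the value for the record at `recordIndex`, wrapping around the list.
function cycleValue<T>(values: T[], recordIndex: number): T {
  if (values.length === 0) {
    throw new Error("sequence must not be empty");
  }
  return values[recordIndex % values.length] as T;
}

const pitches = [48, 52, 55, 60];
const firstFive = [0, 1, 2, 3, 4].map((i) => cycleValue(pitches, i));
console.log(firstFive); // [48, 52, 55, 60, 48]
```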
pitch: [48, 52, 55, 60] // Cycles in order: 48, 52, 55, 60, 48...
color: ["red", "green", "blue"]
Statistical Distributions
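For intuition about the bounded form `gaussian(mean, stddev, min, max)`, the TypeScript sketch below draws a normal sample with the Box-Muller transform and resamples until the value falls inside the bounds. This is only one plausible reading: whether Vague resamples, clamps, or does something else with `min` and `max` is not stated here, and `standardNormal` and `boundedGaussian` are hypothetical names.

```typescript
// Box-Muller transform: turns two uniform samples into one standard normal sample.
function standardNormal(rand: () => number): number {
  const u1 = 1 - rand(); // in (0, 1], avoids log(0)
  const u2 = rand();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Resamples until the value is within [min, max]; falls back to a clamped mean
// if no sample lands in range after `maxTries` attempts.
function boundedGaussian(
  mean: number,
  stddev: number,
  min: number,
  max: number,
  rand: () => number = Math.random,
  maxTries = 1000,
): number {
  for (let i = 0; i < maxTries; i++) {
    const x = mean + stddev * standardNormal(rand);
    if (x >= min && x <= max) return x;
  }
  return Math.min(max, Math.max(min, mean));
}

// Roughly: age: gaussian(35, 10, 18, 65)
const sampledAge = boundedGaussian(35, 10, 18, 65);
console.log(sampledAge >= 18 && sampledAge <= 65); // true
```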
age: gaussian(35, 10, 18, 65) // mean, stddev, min, max
income: lognormal(10.5, 0.5) // mu, sigma
wait_time: exponential(0.5) // rate
daily_orders: poisson(5) // lambda
conversion: beta(2, 5) // alpha, beta
Date Functions
created_at: now() // Full ISO 8601 timestamp
today_date: today() // Date only
past: daysAgo(30) // 30 days ago
future: daysFromNow(90) // 90 days from now
random: datetime(2020, 2024) // Random datetime in range
between: dateBetween("2023-01-01", "2023-12-31")
Sequential Generation
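A prefixed sequence is essentially a counter that remembers its position between records. The TypeScript sketch below models that with a closure. `makeSequence` is a hypothetical name, and the sketch only covers the prefixed form, not `sequenceInt` or `previous`.

```typescript
// Returns a function that yields prefix + counter, incrementing on every call.
function makeSequence(prefix: string, start: number): () => string {
  let next = start;
  return () => `${prefix}${next++}`;
}

// Roughly: id: sequence("INV-", 1001)
const nextInvoiceId = makeSequence("INV-", 1001);
console.log(nextInvoiceId()); // "INV-1001"
console.log(nextInvoiceId()); // "INV-1002"
```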
id: sequence("INV-", 1001) // "INV-1001", "INV-1002", ...
order_num: sequenceInt("orders") // 1, 2, 3, ...
prev_value: previous("amount") // Reference previous record
String Transformations
// Case transformations
upper: uppercase(name) // "HELLO WORLD"
lower: lowercase(name) // "hello world"
capitalized: capitalize(name) // "Hello World"
// Case style conversions
slug: kebabCase(title) // "hello-world"
snake: snakeCase(title) // "hello_world"
camel: camelCase(title) // "helloWorld"
// String manipulation
trimmed: trim(" hello ") // "hello"
combined: concat(first, " ", last) // "John Doe"
part: substring(name, 0, 5) // First 5 characters
replaced: replace(name, "foo", "bar")
len: length(name) // String length
Negative Testing
// Generate data that violates constraints (for testing error handling)
dataset Invalid violating {
  bad_invoices: 100 of Invoice
}
Examples
See the examples/ directory:
- data-description-model/ - Start here: Intent encoding, constraint encoding, edge-case bias
- basics/ - Core language features (schemas, constraints, computed fields, cross-refs)
- openapi-importing/ - Import schemas from OpenAPI specs
- openapi-examples-generation/ - Populate OpenAPI specs with generated examples
- codat/, stripe/, github/, etc. - Real-world API examples
CLI Usage
# Generate JSON to stdout
node dist/cli.js file.vague
# Save to file
node dist/cli.js file.vague -o output.json
# Pretty print
node dist/cli.js file.vague -p
# Reproducible output (seeded random)
node dist/cli.js file.vague --seed 123
# Watch mode - regenerate on file change
node dist/cli.js file.vague -o output.json -w
# CSV output
node dist/cli.js file.vague -f csv -o output.csv
# CSV with options
node dist/cli.js file.vague -f csv --csv-delimiter ";" -o output.csv
# Validate against OpenAPI spec
node dist/cli.js file.vague -v openapi.json -m '{"invoices": "Invoice"}'
# Validate only (exit code 1 on failure, useful for CI)
node dist/cli.js file.vague -v openapi.json -m '{"invoices": "Invoice"}' --validate-only
OpenAPI Example Population
Generate realistic examples and embed them directly in your OpenAPI spec:
# Populate OpenAPI spec with inline examples
node dist/cli.js data.vague --oas-output api-with-examples.json --oas-source api.json
# Multiple examples per schema
node dist/cli.js data.vague --oas-output api.json --oas-source api.json --oas-example-count 3
# External file references instead of inline
node dist/cli.js data.vague --oas-output api.json --oas-source api.json --oas-external
Auto-detection maps collection names to schema names (e.g., invoices → Invoice).
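The only mapping documented above is invoices → Invoice. As a guess at the kind of rule involved, the TypeScript sketch below strips a trailing "s" and capitalizes the first letter. This is a naive stand-in, not the CLI's actual algorithm, and `guessSchemaName` is a hypothetical name. When a collection name does not follow this pattern, passing an explicit `-m` mapping is the safer route.

```typescript
// Naive illustration only: drop one trailing "s", then capitalize the first letter.
function guessSchemaName(collection: string): string {
  const singular = collection.endsWith("s") ? collection.slice(0, -1) : collection;
  return singular.charAt(0).toUpperCase() + singular.slice(1);
}

console.log(guessSchemaName("invoices")); // "Invoice"
console.log(guessSchemaName("customers")); // "Customer"
```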
CLI Options
| Option | Description |
|--------|-------------|
| -o, --output <file> | Write output to file |
| -f, --format <fmt> | Output format: json (default), csv |
| -p, --pretty | Pretty-print JSON |
| -s, --seed <number> | Seed for reproducible generation |
| -w, --watch | Watch input file and regenerate on changes |
| -v, --validate <spec> | Validate against OpenAPI spec |
| -m, --mapping <json> | Schema mapping {"collection": "SchemaName"} |
| --validate-only | Only validate, don't output data |
| --csv-delimiter <char> | CSV field delimiter (default: ,) |
| --csv-no-header | Omit CSV header row |
| --csv-arrays <mode> | Array handling: json, first, count |
| --csv-nested <mode> | Nested objects: flatten, json |
| --infer <file> | Infer Vague schema from JSON or CSV data |
| --collection-name <name> | Collection name for CSV inference |
| --infer-delimiter <char> | CSV delimiter for inference (default: ,) |
| --dataset-name <name> | Dataset name for inference |
| --oas-source <spec> | Source OpenAPI spec to populate with examples |
| --oas-output <file> | Output path for populated OpenAPI spec |
| --oas-example-count <n> | Number of examples per schema (default: 1) |
| --oas-external | Use external file references instead of inline |
| --plugins <dir> | Load plugins from directory (can be used multiple times) |
| --no-auto-plugins | Disable automatic plugin discovery |
| --verbose | Show verbose output (e.g., discovered plugins) |
| -h, --help | Show help |
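The `--seed` option exists because a seeded pseudo-random generator produces the same sequence on every run. The TypeScript sketch below demonstrates the principle with mulberry32, a small and widely used 32-bit generator. It is an assumption chosen for this illustration: the README does not say which generator Vague uses, and `drawAmounts` is a hypothetical helper.

```typescript
// mulberry32: deterministic generator returning numbers in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draws `count` integers in 10..500 from a generator seeded with `seed`.
function drawAmounts(seed: number, count: number): number[] {
  const rand = mulberry32(seed);
  const amounts: number[] = [];
  for (let i = 0; i < count; i++) {
    amounts.push(10 + Math.floor(rand() * 491));
  }
  return amounts;
}

// Same seed, same data.
console.log(drawAmounts(123, 3).join(",") === drawAmounts(123, 3).join(",")); // true
```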
Development
npm run build # Compile TypeScript
npm test # Run tests
npm run dev # Watch mode
Project Structure
src/
├── lexer/ # Tokenizer
├── parser/ # Recursive descent parser
├── ast/ # AST node definitions
├── interpreter/ # JSON generator
├── validator/ # Schema validation (Ajv)
├── openapi/ # OpenAPI import support
├── infer/ # Schema inference from data
├── csv/ # CSV input/output formatting
├── config/ # Configuration file loading
├── logging/ # Debug logging utilities
├── plugins/ # Built-in plugins (faker, issuer, date, regex)
├── index.ts # Library exports
└── cli.ts # CLI entry point
Roadmap
See TODO.md for planned features:
- Probabilistic constraints (assume X with probability 0.7)
- Conditional schema variants
- Constraint solving (SMT integration)
Working with Claude
This project includes Claude Code skills that help Claude assist you more effectively when working with Vague files and OpenAPI specifications.
Available Skills
| Skill | Description |
|-------|-------------|
| vague | Writing Vague (.vague) files - syntax, constraints, cross-references |
| openapi | Working with OpenAPI specs - validation, schemas, best practices |
Installation via OpenSkills
Install the skills using OpenSkills:
npm i -g openskills
openskills install mcclowes/vague
This installs the skills to your .claude/skills/ directory, making them available when you use Claude Code in this project.
Manual Installation
Alternatively, copy the skills directly:
git clone https://github.com/mcclowes/vague.git
cp -r vague/.claude/skills/* ~/.claude/skills/
Contributing
See CONTRIBUTING.md for guidelines.
License
MIT License - see LICENSE
