flex-md
v4.7.1
Parse and stringify FlexMD: semi-structured Markdown with three powerful layers - Frames, Output Format Spec (OFS), and Detection/Extraction.
Flex-MD (v4.0) — Markdown Output Contract with Smart Token Estimation
Flex-MD is a TypeScript library for building and enforcing Markdown Output Contracts with LLMs. It treats Markdown as a semi-structured data format, allowing you to define required sections, list types, and tables while maintaining 100% standard Markdown compatibility.
What's New in v4.0:
- 🎯 Automatic Token Estimation: Calculate `max_tokens` directly from your spec
- 📏 System Parts Protocol: Standardized size hints that guide LLMs AND enable token prediction
- 🧠 Smart Toolbox: Cognitive cost analysis, confidence scoring, and improvement detection
- 🔧 Auto-Fix: Automatically improve specs with one command
- 🔄 Markdown to JSON: Convert sectioned MD to structured objects with camel-casing (v4.1)
Key Features
Core (v3.0)
- Standard Markdown: No proprietary tags. Pure headings, lists, and tables.
- Strictness Levels (L0–L3): From loose guidance to rigid structural enforcement.
- Deterministic Repair: Auto-fixes misformatted LLM output (merged fences, missing headings, format conversion).
- Instructions Output Format Guidance: Generate formal "Instructions Blocks" for LLM prompts directly from spec objects.
- Issues Envelope: A structured failure format for when repairs fail, allowing safe fallbacks.
Smart Features (v4.0)
- Token Estimation: Automatically calculate `max_tokens` for API calls based on your spec
- System Parts: Structured instruction patterns (`Length: 2-3 paragraphs`, `Items: 3-5`) that guide LLMs and enable estimation
- Compliance Checking: Validate specs meet quality standards (L0-L3 compliance levels)
- Cognitive Cost Analysis: Measure how much effort your spec requires to write/maintain
- Confidence Scoring: Know how accurate your token estimates will be
- Improvement Detection: Find issues and get actionable suggestions
- Auto-Fix: Apply improvements automatically
- Markdown to JSON: Transform sectioned Markdown into structured JSON with camel-cased keys
Installation
npm install flex-md

Quick Start
1. Define your Output Format Spec (OFS) with System Parts
import { parseOutputFormatSpec, getMaxTokens } from 'flex-md';
const spec = parseOutputFormatSpec(`
## Output format
- Short answer — text (required)
Length: 1-2 sentences. Be concise and direct.
- Reasoning — ordered list (required)
Items: 3-5. Explain your logic step by step.
- Assumptions — list (optional)
Items: at least 2. List any key assumptions made.
empty sections:
- If a section is empty, write \`None\`.
`);
// Automatically estimate max_tokens needed
const maxTokens = getMaxTokens(spec);
console.log(`Estimated max_tokens: ${maxTokens}`); // ~650
2. Use in Your LLM API Call
const response = await fetch('https://api.anthropic.com/v1/messages', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-api-key': API_KEY
},
body: JSON.stringify({
model: 'claude-sonnet-4-20250514',
max_tokens: maxTokens, // Automatically calculated!
messages: [{
role: 'user',
content: yourPrompt + '\n\n' + buildMarkdownGuidance(spec)
}]
})
});
3. Enforce the Contract
import { enforceFlexMd } from 'flex-md';
const llmResponse = await response.json();
const result = enforceFlexMd(llmResponse.content[0].text, spec, { level: 2 });
if (result.ok) {
console.log(result.extracted.sectionsByName["Short answer"].md);
console.log(result.extracted.sectionsByName["Reasoning"].md);
} else {
console.log(result.outputText); // Issues Envelope
}
System Parts Protocol
System Parts are structured prefixes in section instructions that serve dual purposes:
- Guide the LLM on expected output size
- Enable token estimation for `max_tokens` calculation
Syntax
[SYSTEM_PART]. [OPTIONAL_GUIDANCE]
Examples:
// Text sections
"Length: 2-3 paragraphs. Provide detailed analysis."
"Length: brief. Keep it short."
// Lists
"Items: 3-5. Focus on key insights."
"Items: at least 3. Be comprehensive."
// Tables
"Rows: 5-7, Columns: 3. Include metrics."
// Code
"Lines: 20-30. Include error handling."
"Lines: ~50. Provide complete example."
Allowed Values
| Section Type | System Part Pattern | Examples |
|--------------|-------------------|----------|
| text | Length: <value> | brief, moderate, detailed, extensive, 1-2 sentences, 2-3 paragraphs |
| list | Items: <value> | 3, 3-5, at least 3 |
| table | Rows: <value>, Columns: <value> | Rows: 5, Columns: 3, Rows: 3-5, Columns: 4 |
| code | Lines: <value> | 20, 15-25, ~50 |
See System Parts Guide for complete reference.
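Because system parts follow a fixed prefix grammar, a size hint can be recovered with a single regular expression. The sketch below is a self-contained illustration, not the library's `parseSystemPart` implementation; the `parseSizeHint` name and the choice to budget "at least N" as roughly 2×N are assumptions made for this example:

```typescript
// Hypothetical re-implementation of system-part parsing, for illustration only.
// Keyword sizes like "Length: brief" are omitted for brevity.
type SizeHint = { field: string; min: number; max: number };

// Matches prefixes like "Items: 3-5.", "Items: at least 3.", "Lines: ~50."
function parseSizeHint(instruction: string): SizeHint | null {
  const m = instruction.match(
    /^(Length|Items|Rows|Lines):\s*(?:at least\s+(\d+)|~\s*(\d+)|(\d+)(?:\s*-\s*(\d+))?)/i
  );
  if (!m) return null;
  const [, field, atLeast, approx, single, rangeMax] = m;
  if (atLeast) return { field, min: Number(atLeast), max: Number(atLeast) * 2 }; // open-ended: assume 2x
  if (approx) return { field, min: Number(approx), max: Number(approx) };
  const min = Number(single);
  return { field, min, max: rangeMax ? Number(rangeMax) : min };
}

console.log(parseSizeHint('Items: 3-5. Focus on key insights.'));
// → { field: 'Items', min: 3, max: 5 }
```

Once a hint is reduced to a numeric range like this, converting it to a token budget is a matter of multiplying by a per-unit token cost (sentence, paragraph, item, row, or line).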
Compliance Levels (for Spec Authors)
Compliance levels measure how much detail you provide in system parts:
| Level | Detail | Cognitive Load | Token Estimation Accuracy |
|-------|--------|----------------|---------------------------|
| L0 | No system parts | None | Fallback (~±40%) |
| L1 | Simple values | Minimal | Basic (~±30%) |
| L2 | Ranges allowed | Low | Good (~±20%) |
| L3 | Full spec with "at least", "~" | Medium | Precise (~±10%) |
L2 is recommended for most use cases: it offers a good balance of effort and accuracy.
Examples by Level
// L0 - No system parts (fallback estimation)
"Just provide a summary."
// L1 - Simple values
"Length: brief. Provide a summary."
"Items: 3. List the main points."
// L2 - Ranges
"Length: 2-3 paragraphs. Provide detailed analysis."
"Items: 3-5. List key insights."
// L3 - Full specification
"Items: at least 5. Include all relevant factors."
"Lines: ~50. Provide a complete working example."
Input vs Output Format Specs
Flex-MD supports both Input Format Specs and Output Format Specs, but they serve different purposes:
Input Format Specs (Planning & Design)
Input Format Specs (defined with ## Input format) are design-time tools for planning and documentation:
- ✅ Documentation: Clearly specify what input structure your LLM expects
- ✅ Planning: Help design your prompt structure and data flow
- ✅ Coding: Guide developers on how to structure inputs
- ✅ Validation: Can be used to validate input data (optional)
Input format specs are NOT used for runtime token estimation — they're for human understanding and system design.
Output Format Specs (Runtime)
Output Format Specs (defined with ## Output format) are runtime tools:
- ✅ Token Estimation: Used to calculate `max_tokens` for API calls
- ✅ Validation: Enforce structure on LLM responses
- ✅ Extraction: Parse structured data from responses
- ✅ Guidance: Tell the LLM exactly what format to return
Output format specs ARE used for runtime token estimation — they directly impact API call parameters.
Example: Using Both
const instructions = `
## Input format
- User Query — text (required)
The question or request from the user.
- Context — list (optional)
Items: 1-5. Additional context items.
## Output format
- Answer — text (required)
Length: 2-3 paragraphs. Provide a comprehensive answer.
- Sources — list (required)
Items: 3-7. List information sources used.
`;
// Parse both formats
import { parseFormatSpecs } from 'flex-md';
const { input, output } = parseFormatSpecs(instructions);
// Input format: Use for documentation/planning
console.log('Expected input structure:', input);
// Output format: Use for runtime token estimation
const maxTokens = getMaxTokens(output);
Smart Toolbox
Token Estimation
Planning-Time Estimation
For estimating tokens from format specs directly (useful during development):
import { getMaxTokens, estimateSpecTokens } from 'flex-md';
// Quick estimate from output format spec
const maxTokens = getMaxTokens(spec);
// Detailed estimate with options
const estimate = estimateSpecTokens(spec, {
includeOptional: true,
safetyMultiplier: 1.3,
strategy: 'average' // 'conservative' | 'average' | 'generous'
});
console.log(estimate);
// {
// total: { estimated: 650, min: 520, max: 780, confidence: 'high' },
// bySectionName: { ... },
// overhead: 60
// }
Runtime Estimation
For estimating tokens at runtime with actual prompt, context, and instructions:
import { runtimeEstimateTokens } from 'flex-md';
const estimate = runtimeEstimateTokens({
prompt: 'Analyze this code and explain what it does.',
context: 'Previous conversation about TypeScript...',
instructions: `
You are a helpful coding assistant.
## Output format
- Analysis — text (required)
Length: 2-3 paragraphs. Explain the code functionality.
- Code Quality — list (required)
Items: 3-5. List quality observations.
`,
options: {
safetyMultiplier: 1.2,
strategy: 'average',
additionalOverhead: 50 // For system messages, formatting, etc.
}
});
console.log(estimate.maxTokens); // Recommended max_tokens for API call (output tokens only)
console.log(estimate.breakdown);
// {
// prompt: 12, // Input tokens (for budgeting)
// context: 8, // Input tokens (for budgeting)
// instructions: 45, // Input tokens (for budgeting)
// output: { total: { estimated: 450, ... }, ... }, // Output token estimate
// additionalOverhead: 50,
// total: 565 // Total tokens (input + output) for budgeting
// }
// Use in API call
const response = await fetch('https://api.anthropic.com/v1/messages', {
method: 'POST',
body: JSON.stringify({
model: 'claude-sonnet-4-20250514',
max_tokens: estimate.maxTokens, // Output tokens only (extracted from output format spec)
messages: [{
role: 'user',
content: prompt + '\n\n' + instructions
}]
})
});
The runtimeEstimateTokens function:
- Extracts output format specs from instructions automatically (input format specs are ignored)
- Estimates tokens for prompt, context, and instructions text (for budgeting/planning)
- Estimates output tokens based on the output format spec (this becomes `max_tokens`)
- Provides a breakdown showing input tokens (for budgeting) and output tokens (for the API parameter)
- Returns `maxTokens`, which is the value to use for the `max_tokens` API parameter (output only)
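For intuition, the input-side budgeting can be approximated with the common rule of thumb of roughly 4 characters per token for English text. The `roughBudget` helper below is only a sketch under that assumption, not part of flex-md's API, and its numbers will not match the library's actual estimator:

```typescript
// Sketch of input/output token budgeting, mirroring the breakdown shape above.
// ASSUMPTION: ~4 chars/token (a common English-text approximation, not
// flex-md's documented heuristic).
type Breakdown = {
  prompt: number;
  context: number;
  instructions: number;
  additionalOverhead: number;
  total: number;
};

function roughBudget(
  parts: { prompt: string; context?: string; instructions: string },
  outputEstimate: number, // e.g. the value you would get from getMaxTokens(spec)
  additionalOverhead = 0
): Breakdown {
  const t = (s: string) => Math.ceil(s.length / 4); // chars-per-token heuristic
  const breakdown: Breakdown = {
    prompt: t(parts.prompt),
    context: t(parts.context ?? ''),
    instructions: t(parts.instructions),
    additionalOverhead,
    total: 0,
  };
  breakdown.total =
    breakdown.prompt + breakdown.context + breakdown.instructions +
    outputEstimate + additionalOverhead;
  return breakdown;
}
```

The key point the breakdown illustrates: only the output estimate feeds `max_tokens`; the input-side numbers exist for cost and context-window budgeting.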
Compliance Checking
import { checkCompliance, formatComplianceReport } from 'flex-md';
const report = checkCompliance(spec, 2); // Check if meets L2
console.log(formatComplianceReport(report));
// Shows which sections need improvement to meet the target level
Confidence Scoring
import { calculateConfidence } from 'flex-md';
const confidence = calculateConfidence(spec);
console.log(`Confidence: ${confidence.grade} (${confidence.overall}%)`);
console.log('Recommendations:', confidence.recommendations);
// Grade: B (82%)
// Recommendations: ["Good confidence, but can be improved", ...]
Cognitive Cost Analysis
import { calculateCognitiveCost } from 'flex-md';
const cost = calculateCognitiveCost(spec);
console.log(`Cost: ${cost.totalCost}/100`);
console.log(`Assessment: ${cost.recommendation}`);
// Cost: 28/100
// Assessment: "Moderate cognitive load - reasonable effort required"
Improvement Detection
import { detectImprovements, formatImprovementReport, autoFix } from 'flex-md';
// Detect issues and opportunities
const analysis = detectImprovements(spec, 2);
console.log(formatImprovementReport(analysis));
// Auto-fix quick wins
const fixResult = autoFix(spec, analysis.improvements, {
applyQuickWinsOnly: true
});
console.log(fixResult.summary);
// "Applied 4 fixes, skipped 1"
Complete Smart Analysis
import { analyzeSpec, formatSmartReport } from 'flex-md';
const analysis = analyzeSpec(spec, 2);
console.log(formatSmartReport(analysis));
Output:
╔═══════════════════════════════════════════════════╗
║ FLEX-MD SMART ANALYSIS REPORT ║
╚═══════════════════════════════════════════════════╝
📊 SUMMARY DASHBOARD
──────────────────────────────────────────────────
Compliance: ✓ L2 PASS
Confidence: B (82%)
Cognitive Cost: 28/100
Token Estimate: 650 tokens
💡 RECOMMENDATIONS
──────────────────────────────────────────────────
🟢 Low Priority:
• Good confidence, but can be improved
→ Upgrade a few sections to L3 for better precision
...
Strictness Levels (for LLM Output Enforcement)
| Level | Goal | Guidance | Enforcement |
| :--- | :--- | :--- | :--- |
| L0 | Plain Markdown | "Reply in Markdown." | None. Accept as-is. |
| L1 | Sectioned MD | "Include these headings..." | Headings must exist. |
| L2 | Fenced Container | "Return inside a single block..." | Exactly one fenced block. |
| L3 | Typed Structure | "Reasoning is an ordered list..." | Enforce list/table kinds. |
Note: These are different from Compliance Levels (which measure spec quality) - Strictness Levels control how strictly Flex-MD enforces the contract on LLM output.
The Repair Pipeline
Flex-MD doesn't just validate; it repairs. Our deterministic 9-step plan handles:
- Container Normalization: Wrapping or merging multiple fenced blocks.
- Heading Standardization: Case-insensitive matching and naming cleanup.
- Missing Headings: Adding required sections as `None`.
- Stray Content: Moving text outside headings into a default section.
- Format Conversion: Transforming bullets to numbered lists (and vice-versa) based on spec.
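The flavor of one such step, Missing Headings, can be sketched in a few lines. This is an illustrative simplification, not the library's actual pipeline; the `addMissingHeadings` helper and the fixed `###` heading depth are assumptions made for the example:

```typescript
// Simplified sketch of the "Missing Headings" repair step: any required
// section absent from the output is appended with a `None` body.
// NOT flex-md's actual implementation.
function addMissingHeadings(md: string, requiredSections: string[]): string {
  // Collect headings already present (case-insensitive, per the pipeline's
  // heading standardization step)
  const present = new Set(
    [...md.matchAll(/^###\s+(.+?)\s*$/gm)].map(m => m[1].toLowerCase())
  );
  let repaired = md.trimEnd();
  for (const name of requiredSections) {
    if (!present.has(name.toLowerCase())) {
      repaired += `\n\n### ${name}\nNone`;
    }
  }
  return repaired;
}

const fixed = addMissingHeadings('### Summary\nAll good.', ['Summary', 'Assumptions']);
// Appends "### Assumptions" with a "None" body; "Summary" is left untouched
```

The real pipeline layers steps like this deterministically, so the same malformed input always repairs to the same output.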
Real-World Example
import {
parseOutputFormatSpec,
getMaxTokens,
analyzeSpec,
enforceFlexMd
} from 'flex-md';
// 1. Define spec with system parts
const spec = parseOutputFormatSpec(`
## Output format
- Executive Summary — text (required)
Length: 2-3 paragraphs. Summarize findings and recommendations.
- Key Metrics — table (required)
Rows: 5-7, Columns: 3. Include: Metric, Current, Target.
- Action Items — ordered list (required)
Items: 5-10. Prioritize by impact.
- Technical Details — code (optional)
Lines: 20-30. Include implementation examples.
`);
// 2. Analyze spec quality
const analysis = analyzeSpec(spec, 2);
console.log(`Confidence: ${analysis.confidence.grade}`);
console.log(`Max tokens: ${analysis.tokenEstimate.total.estimated}`);
// 3. Use in API call
const response = await anthropic.messages.create({
model: 'claude-sonnet-4-20250514',
max_tokens: getMaxTokens(spec, { safetyMultiplier: 1.3 }),
messages: [{
role: 'user',
content: `Analyze Q4 performance.\n\n${buildMarkdownGuidance(spec)}`
}]
});
// 4. Enforce and extract
const result = enforceFlexMd(response.content[0].text, spec, { level: 2 });
if (result.ok) {
const summary = result.extracted.sectionsByName["Executive Summary"].md;
const metrics = result.extracted.sectionsByName["Key Metrics"].md;
// Use structured output...
}
Advanced Usage
Custom Token Estimation
const estimate = estimateSpecTokens(spec, {
includeOptional: false, // Skip optional sections
safetyMultiplier: 1.5, // Extra headroom
strategy: 'conservative' // Use minimum estimates
});
CI/CD Integration
// validate-specs.ts
import { analyzeSpec } from 'flex-md';
const analysis = analyzeSpec(spec, 2);
const highPriorityIssues = analysis.recommendations
.filter(r => r.priority === 'high');
if (highPriorityIssues.length > 0) {
console.error('High priority issues found');
process.exit(1);
}
Progressive Enhancement
Start simple and upgrade as needed:
// Version 1: No system parts (works, but fallback estimation)
const v1 = `
## Output format
- Summary — text (required)
Write a summary.
`;
// Version 2: Add L1 system parts (better)
const v2 = `
## Output format
- Summary — text (required)
Length: brief. Write a summary.
`;
// Version 3: Upgrade to L2 (best balance)
const v3 = `
## Output format
- Summary — text (required)
Length: 2-3 sentences. Write a summary.
`;
API Reference
Core Functions
- `parseOutputFormatSpec(markdown)` - Parse an output format spec from Markdown
- `parseInputFormatSpec(markdown)` - Parse an input format spec from Markdown
- `parseFormatSpecs(instructions)` - Extract both input and output format specs from instructions
- `stringifyOutputFormatSpec(spec)` - Convert a spec to Markdown
- `buildMarkdownGuidance(spec, options)` - Generate LLM instructions
- `enforceFlexMd(text, spec, options)` - Validate and repair LLM output
Token Estimation (v4.0)
- `getMaxTokens(spec, options?)` - Get the estimated `max_tokens` for a spec
- `getMaxTokensFromInstructions(instructions, options?)` - Extract format specs and estimate tokens
- `estimateSpecTokens(spec, options?)` - Detailed token estimate from a spec
- `runtimeEstimateTokens(params)` - Runtime estimation with prompt, context, and instructions
- `parseSystemPart(instruction, kind)` - Parse a system part from an instruction
- `estimateTokens(systemPart)` - Estimate tokens for a system part
Smart Toolbox (v4.0)
- `checkCompliance(spec, level)` - Validate compliance level
- `calculateConfidence(spec)` - Score estimation confidence
- `calculateCognitiveCost(spec)` - Measure spec complexity
- `detectImprovements(spec, level?)` - Find issues and suggestions
- `autoFix(spec, improvements, options?)` - Apply automatic fixes
- `analyzeSpec(spec, level?)` - Complete smart analysis
Reporting (v4.0)
- `formatComplianceReport(report)` - Format a compliance check
- `formatImprovementReport(analysis)` - Format improvements
- `formatSmartReport(analysis)` - Format the complete analysis
Markdown to JSON Transformation (v4.1)
Flex-MD includes a robust utility to convert sectioned Markdown (headings and their respective bodies) into a standard JavaScript object/JSON. This is particularly useful for extracting structured data from LLM responses without needing a complex schema.
Features:
- Automatic Camel-Casing: Headings like `### Short Answer` or `=== Next Steps` are automatically converted to valid camelCase keys (`shortAnswer`, `nextSteps`).
- Robust Newline Handling: Automatically handles both actual newlines and literal `\n` escape sequences often found in LLM outputs.
- Support for All Heading Types: Works with standard `###` headings and `=== key` alternative delimiters.
Usage:
import { markdownToJson } from 'flex-md';
const md = `
### Short Answer
The asset is a server named server1 with private IP 192.168.1.1.
### Next Steps
1. Document in CMDB
2. Perform security check
`;
const data = markdownToJson(md);
console.log(data.shortAnswer);
// "The asset is a server named server1 with private IP 192.168.1.1."
console.log(data.nextSteps);
// "1. Document in CMDB\n2. Perform security check"
NX Flex-MD: Schema-Driven Extraction (Advanced)
For more complex scenarios where you need to enforce a specific JSON schema, handle fuzzy matching of headings, or apply automatic fixes to the data, Flex-MD exposes the NX Flex-MD toolset (powered by nx-md-parser).
Feature Highlights:
- Schema Validation: Define the exact structure and types (`string`, `number`, `boolean`, `array`, `object`) you expect.
- Intelligent Heading Matching: Matches headings even if they aren't an exact match (e.g., "Summary" vs "Exec Summary").
- Automatic Type Conversion: Converts Markdown strings to numbers or booleans based on your schema.
- Auto-Fixing: Automatically corrects common formatting issues to match the schema.
Usage:
import { JSONTransformer, Schema } from 'flex-md';
// 1. Define a schema
const schema = Schema.object({
status: Schema.string(),
score: Schema.number(),
isVerified: Schema.boolean(),
tags: Schema.array(Schema.string())
});
// 2. Create the transformer
const transformer = new JSONTransformer(schema);
// 3. Transform complex Markdown
const md = `
### Status
Active and operational
### Score
95.5
### Verified
Yes
### Tags
- production
- critical
`;
const { result, status, errors } = transformer.transformMarkdown(md);
if (status === 'validated' || status === 'fixed') {
console.log(result);
/*
{
status: "Active and operational",
score: 95.5,
isVerified: true,
tags: ["production", "critical"]
}
*/
}
Flex-MD allows you to use its native Output Format Spec (OFS) as the source of truth for NX-MD-Parser's structured extraction.
Modern "Smart" Transformation
The transformWithOfs function is now smarter. It performs dual parsing:
- Automatic Parsing: Even if you don't have a spec, it extracts all sections and camel-cases keys.
- Contract Enforcement: If you provide a spec, it uses `nx-md-parser` to validate, type-cast (lists/tables), and repair the output.
Usage with LLM Outputs:
When working with LLMs, pass the entire response text directly. Flex-MD handles internal normalization (like escaped \n characters) automatically.
import { transformWithOfs } from 'flex-md';
// Pass the RAW content string from your LLM provider
const {
parsedOutput, // Always populated (auto-extraction)
contractOutput, // Populated if spec was provided
contractStatus, // "ok" | "different" | "skipped"
status // "validated" | "fixed" | "failed"
} = transformWithOfs(llmResponseText, spec);
console.log(parsedOutput.shortAnswer);
Why use this?
- Zero-Config Extraction: Get structured data without writing a schema first.
- Dual-Safe: Compare what the LLM sent (`parsedOutput`) with what the contract required (`contractOutput`).
- Internal Normalization: Handles messy data (escaped newlines, merged code blocks) so you don't have to.
- Fuzzy Matching: Even if the LLM slightly changes the heading (e.g., "Summary" vs "Executive Summary"), the contract will correctly map it.
Advanced AI Features (via NX-MD-Parser 1.4.0)
Flex-MD utilizes the full power of nx-md-parser v1.4.0, providing enterprise-grade AI transformation capabilities.
🤖 Multi-Algorithm Fuzzy Matching
The engine uses a weighted combination of four powerful algorithms to find the best match for your headings and keys:
- Jaro-Winkler: Character-level similarity (40%)
- Jaccard Tokens: Token-based similarity (30%)
- Dice Coefficient: N-gram similarity (20%)
- Levenshtein Ratio: Edit distance (10%)
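For intuition, a weighted blend of similarity measures can be sketched directly. The version below is a simplified stand-in: Jaro-Winkler is omitted for brevity and its weight redistributed across the other three, so these scores will NOT match nx-md-parser's actual output:

```typescript
// Token-set similarity (Jaccard): shared tokens / all tokens.
function jaccardTokens(a: string, b: string): number {
  const A = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const B = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  const inter = [...A].filter(t => B.has(t)).length;
  const union = new Set([...A, ...B]).size;
  return union === 0 ? 1 : inter / union;
}

// Dice coefficient over character bigrams.
function diceBigrams(a: string, b: string): number {
  const grams = (s: string) => {
    const t = s.toLowerCase();
    const g: string[] = [];
    for (let i = 0; i < t.length - 1; i++) g.push(t.slice(i, i + 2));
    return g;
  };
  const ga = grams(a), gb = grams(b);
  if (ga.length + gb.length === 0) return 1;
  const counts = new Map<string, number>();
  for (const g of ga) counts.set(g, (counts.get(g) ?? 0) + 1);
  let overlap = 0;
  for (const g of gb) {
    const c = counts.get(g) ?? 0;
    if (c > 0) { overlap++; counts.set(g, c - 1); }
  }
  return (2 * overlap) / (ga.length + gb.length);
}

// Levenshtein edit distance normalized to [0, 1].
function levenshteinRatio(a: string, b: string): number {
  const m = a.length, n = b.length;
  if (m + n === 0) return 1;
  const d: number[][] = Array.from({ length: m + 1 }, (_, i) => [i, ...Array(n).fill(0)]);
  for (let j = 0; j <= n; j++) d[0][j] = j;
  for (let i = 1; i <= m; i++)
    for (let j = 1; j <= n; j++)
      d[i][j] = Math.min(
        d[i - 1][j] + 1,
        d[i][j - 1] + 1,
        d[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)
      );
  return 1 - d[m][n] / Math.max(m, n);
}

// Weighted blend. ASSUMPTION: Jaro-Winkler omitted, weights renormalized
// (Jaccard 0.5, Dice 0.33, Levenshtein 0.17).
function similarity(a: string, b: string): number {
  return 0.5 * jaccardTokens(a, b) + 0.33 * diceBigrams(a, b) + 0.17 * levenshteinRatio(a, b);
}
```

Blending several measures like this is what lets a heading such as "Exec Summary" still score well against a schema key of "Summary" even though their edit distance alone is large.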
🧠 Machine Learning (Learn Aliases)
You can let the system learn from your data to improve matching over time.
import { learnAliasesFromTransformations } from 'flex-md';
const learningResult = learnAliasesFromTransformations([
{
input: { "Projct Name": "Test" },
output: { title: "Test" },
schema: yourSchema
}
]);
// System now knows "Projct Name" is an alias for "title"
⚙️ Intelligent Auto-Fixing
- Typo Correction: Automatically fixes property name typos.
- Structural Repair: Restructures flat objects into nested schemas.
- Smart Conversion: Automatically handles `string -> number`, `string -> boolean`, and wrapper types.
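A minimal sketch of such conversions, assuming simple coercion rules for illustration (the hypothetical `coerce` helper below is not nx-md-parser's API, and its real logic is richer):

```typescript
// Simplified schema-driven coercion: turn a Markdown section body (a string)
// into the type the schema declares.
function coerce(value: string, target: 'string' | 'number' | 'boolean'): string | number | boolean {
  const v = value.trim();
  switch (target) {
    case 'number': {
      const n = Number(v);
      if (Number.isNaN(n)) throw new Error(`cannot coerce "${v}" to number`);
      return n;
    }
    case 'boolean':
      // Accept common LLM phrasings for booleans
      if (/^(yes|true|y|1)$/i.test(v)) return true;
      if (/^(no|false|n|0)$/i.test(v)) return false;
      throw new Error(`cannot coerce "${v}" to boolean`);
    default:
      return v;
  }
}

coerce('95.5', 'number'); // → 95.5
coerce('Yes', 'boolean'); // → true
```

This is how the earlier example's `### Verified` body of "Yes" can end up as `isVerified: true` in the transformed object.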
Spec Memory: Remember & Recall
Flex-MD includes an in-memory storage feature that lets you "remember" an Output Format Spec and later reuse it via a unique `recallId`. This is especially useful for maintaining state within a single execution environment.
Usage:
import { remember, transformWithOfs } from 'flex-md';
// 1. Remember a spec extracted from instructions
const instructions = `
## Output format
- Confidence — number (required)
- Reason — text (required)
`;
const recallId = remember(instructions); // Returns "unique-uuid-..."
// 2. Later, use the recallId instead of the spec object
const md = `
### Confidence
0.95
### Reason
Everything looks correctly formatted based on initial evidence.
`;
const { contractOutput } = transformWithOfs(md, recallId);
console.log(contractOutput.Confidence); // 0.95
Why use this?
- Cleaner Code: Passes simple strings instead of complex spec objects.
- Reference-based workflow: Useful when multiple parts of your system need to agree on the same specification.
- Efficiency: The spec is parsed once and reused.
Documentation
Detailed guides can be found in the docs folder:
- System Parts Guide - Complete protocol reference
- Token Estimation Guide - How estimation works
- Smart Toolbox Guide - Using analysis features
- MDFlex Compliance Spec - Output enforcement
Output Format Spec (OFS) Syntax Reference
Flex-MD uses a simple Markdown block to define the expected output contract. This block is both machine-readable (for validation) and human-readable (for the LLM).
1. Basic Structure
Start with `## Output format`. List sections using `- Name — Kind`.
## Output format
- Summary — text (required)
- Reasoning — ordered list (required)
- Key Tags — list (optional)
2. Available Kinds
| Kind | Description | Validation Rule |
| :--- | :--- | :--- |
| text (or prose) | Any text content. | Always matches if section exists. |
| list | Unordered bullets. | Body must contain - items. |
| ordered list | Numbered list. | Body must contain 1. items. |
| table | Markdown table. | Must match column headers. |
3. Tables
Define tables by specifying columns in parentheses.
Tables:
- (Name, Age, Role — table)
- (Rank, Team, Score — ordered table)
- table: Standard Markdown pipe table.
- ordered table: Must have a first column `#` with row numbers `1..N`.
4. Instructions & Constraints
You can add instructions under any section item to guide the LLM.
- Summary — text
Length: 2-3 sentences. No bullet points.
5. Empty Sections
Define what to write if a section has no content:
Empty sections:
- If a section is empty, write `None`.
6. 📝 Complete Example
Here is a full real-world example from a security analysis tool:
## Output format
- Executive Summary — text (required)
Brief overview of findings.
- Critical Vulnerabilities — ordered list (required)
- Remediation Plan — text (required)
Tables:
- (Component, Severity, Fix — table)
Empty sections:
- If a section is empty, write `None`.
Migration from v3.0
v4.0 is 100% backwards compatible with v3.0. All existing code continues to work.
To adopt v4.0 features:
Add system parts to your spec instructions:
- Summary — text (required)
-   Provide a brief overview.
+ Summary — text (required)
+   Length: 2-3 sentences. Provide a brief overview.
Use token estimation:
const maxTokens = getMaxTokens(spec);
Analyze and improve your specs:
const analysis = analyzeSpec(spec);
console.log(formatSmartReport(analysis));
Why Flex-MD v4.0?
Before v4.0
// Guessing max_tokens
const response = await api.create({
max_tokens: 2000, // 🤷 Is this enough? Too much?
...
});
After v4.0
// Precise estimation
const maxTokens = getMaxTokens(spec); // ✓ 650 tokens (±20%)
const response = await api.create({
max_tokens, // 🎯 Right-sized
...
});
Benefits
- ⚡ Faster responses: Right-sized tokens mean lower latency
- 💰 Lower costs: Don't overpay for unused tokens
- 🎯 Better accuracy: Clear size expectations guide LLMs
- 🔍 Quality insights: Know your spec's strengths/weaknesses
- 🛠️ Easy maintenance: Auto-detect and fix issues
License
MIT
Flex-MD v4.0 - Smart Markdown contracts for production LLM applications.
