@memberjunction/ai-prompts
Advanced AI prompt execution engine with hierarchical template composition, intelligent model selection, parallel execution, output validation, and comprehensive execution tracking.
Note on Parameters: This package uses the parameter types defined in `@memberjunction/ai`. For a complete reference of available LLM parameters (temperature, topP, topK, etc.), see the Parameter Reference in the AI Core documentation.
Key Features
🎯 Effort Level Control
Granular control over AI model reasoning effort through a 1-100 integer scale. Higher values request more thorough reasoning and analysis from AI models that support effort levels.
Effort Level Hierarchy
The effort level is resolved using the following precedence (highest to lowest priority):
1. `AIPromptParams.effortLevel` - Runtime override (highest priority)
2. `AIPrompt.EffortLevel` - Individual prompt setting
3. Provider default - Model's natural behavior (lowest priority)
Provider Support
Different AI providers map the 1-100 scale to their specific parameters:
- OpenAI: Maps to `reasoning_effort` (1-33 = low, 34-66 = medium, 67-100 = high)
- Anthropic: Maps to thinking mode with token budgets (1-100 → 25K-2M tokens)
- Groq: Maps to the experimental `reasoning_effort` parameter
- Gemini: Controls reasoning mode intensity
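For illustration, here is a minimal sketch of the OpenAI bucket mapping described above (the helper function is ours, not part of the package API):

```typescript
// Maps the 1-100 effort scale to OpenAI's reasoning_effort buckets
// using the documented ranges (1-33 = low, 34-66 = medium, 67-100 = high).
function mapEffortToOpenAIReasoningEffort(effortLevel: number): 'low' | 'medium' | 'high' {
  if (effortLevel <= 33) return 'low';
  if (effortLevel <= 66) return 'medium';
  return 'high';
}
```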
const params = new AIPromptParams();
params.prompt = myPrompt;
params.effortLevel = 85; // High effort for thorough analysis
const runner = new AIPromptRunner();
const result = await runner.ExecutePrompt(params);

🛡️ Model Selection & Intelligent Failover
The AI Prompts system provides sophisticated model selection with instant failover across models and vendors. Configure explicit model/vendor priorities using AIPromptModel records, and the system automatically tries all candidates in order when errors occur.
Selection Strategies
SelectionStrategy='Specific' (Recommended for production)
- Use explicit `AIPromptModel` configuration for complete control
- Configuration-specific models are tried before universal fallbacks
- Priority determines order (higher number = tried first)
- Instant failover to next candidate on any error
SelectionStrategy='ByPower'
- Automatically selects models based on `PowerRank`
- Use `PowerPreference`: `Highest`, `Lowest`, or `Balanced`
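As a rough sketch (not the engine's actual implementation), `PowerPreference` ordering could look like the following; the semantics of `Balanced` are not documented here, so it is left as a pass-through:

```typescript
type PowerPreference = 'Highest' | 'Lowest' | 'Balanced';

// Orders candidate models by PowerRank according to the preference.
function orderByPower<T extends { PowerRank: number }>(models: T[], preference: PowerPreference): T[] {
  switch (preference) {
    case 'Highest':
      return [...models].sort((a, b) => b.PowerRank - a.PowerRank);
    case 'Lowest':
      return [...models].sort((a, b) => a.PowerRank - b.PowerRank);
    case 'Balanced':
      return [...models]; // assumption: the engine applies its own balancing logic
  }
}
```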
SelectionStrategy='Default'
- Uses model type filtering and power ranking
Model Ranking Algorithm (Specific Strategy)
When using SelectionStrategy='Specific', candidates are prioritized using clear, predictable rules:
```mermaid
sequenceDiagram
participant User as User Request
participant Engine as AIPromptRunner
participant DB as AIPromptModel Table
participant Exec as Execution
User->>Engine: Execute Prompt with ConfigurationID
Engine->>DB: Get AIPromptModel records
DB-->>Engine: Return all records for prompt
Note over Engine: Filter Phase
Engine->>Engine: Keep: ConfigurationID match OR NULL
Engine->>Engine: Exclude: Different ConfigurationID
Note over Engine: Sort Phase (2-level)
Engine->>Engine: 1. Config-match before Universal
Engine->>Engine: 2. Priority DESC within group
Note over Engine: Expand Phase
loop For each AIPromptModel
alt VendorID specified
Engine->>Engine: Create 1 candidate (Model+Vendor)
else VendorID is NULL
Engine->>Engine: Create N candidates (all vendors)
Engine->>Engine: Sort by AIModelVendor.Priority DESC
end
end
Engine->>Exec: Try candidates in order
loop Instant Failover
Exec->>Exec: Try Candidate N
alt Success
Exec-->>User: Return result
else Recoverable Error
Exec->>Exec: Try Candidate N+1 (instant)
else Fatal Error
Exec-->>User: Fail immediately
end
end
```

Configuration Rules
Priority Precedence:
- Configuration-specific models (matching `ConfigurationID`) - Always tried first
- Universal models (`ConfigurationID = NULL`) - Fallback options
- Within each group: Higher Priority number tried first
Configuration Filtering:
- If `ConfigurationID` provided: Use matching config + universal (NULL) models
- If NO `ConfigurationID`: Use ONLY universal (NULL) models
- Models with a DIFFERENT `ConfigurationID` are EXCLUDED
Vendor Expansion:
- `AIPromptModel.VendorID` specified → Single candidate (exact model+vendor)
- `AIPromptModel.VendorID = NULL` → Multiple candidates (all vendors for that model, sorted by `AIModelVendor.Priority DESC`)
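A minimal sketch of the expansion rule above (the types and names are illustrative, not the package's internals):

```typescript
interface PromptModelConfig { modelId: string; vendorId: string | null; }
interface ModelVendor { modelId: string; vendorId: string; priority: number; }

// A specified VendorID yields one candidate; a NULL VendorID fans out to one
// candidate per vendor offering that model, sorted by AIModelVendor.Priority DESC.
function expandCandidates(pm: PromptModelConfig, vendors: ModelVendor[]) {
  if (pm.vendorId) return [{ modelId: pm.modelId, vendorId: pm.vendorId }];
  return vendors
    .filter(v => v.modelId === pm.modelId)
    .sort((a, b) => b.priority - a.priority)
    .map(v => ({ modelId: v.modelId, vendorId: v.vendorId }));
}
```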
Example Configuration
-- Example: Production prompt with config-specific and universal fallbacks
INSERT INTO AIPromptModel (PromptID, ModelID, VendorID, ConfigurationID, Priority, Status) VALUES
-- Config-specific models (tried first, regardless of priority number)
(@promptId, @gpt4Id, @openaiId, @prodConfigId, 5, 'Active'),
(@promptId, @gpt4Id, @azureId, @prodConfigId, 3, 'Active'),
-- Universal fallbacks (tried after config-specific, despite higher priority numbers)
(@promptId, @claudeId, @anthropicId, NULL, 10, 'Active'),
(@promptId, @geminiId, @googleId, NULL, 8, 'Active');
-- Example: Multi-vendor support for same model
INSERT INTO AIPromptModel (PromptID, ModelID, VendorID, ConfigurationID, Priority, Status) VALUES
(@promptId, @gpt4Id, NULL, @prodConfigId, 10, 'Active');
-- VendorID=NULL expands to all vendors (OpenAI, Azure, Groq)
-- Vendors sorted by AIModelVendor.Priority

Execution order for the above config with ConfigurationID=@prodConfigId:
1. GPT-4/OpenAI (Config match, Priority 5)
2. GPT-4/Azure (Config match, Priority 3)
3. GPT-4/OpenAI (From VendorID=NULL expansion, highest AIModelVendor.Priority)
4. GPT-4/Azure (From VendorID=NULL expansion)
5. GPT-4/Groq (From VendorID=NULL expansion, lowest AIModelVendor.Priority)
6. Claude/Anthropic (Universal fallback, Priority 10)
7. Gemini/Google (Universal fallback, Priority 8)
Failover Behavior
Instant Failover - No delays between candidates
- Authentication errors → Filters out all candidates from failed vendor
- Fatal errors → Stops immediately
- Recoverable errors → Tries next candidate instantly
Validation Retry - After all candidates exhausted
- If all candidates fail, retries entire list with delays
- Uses `AIPrompt.MaxRetries` and `RetryDelayMode` (Fixed/Linear/Exponential)
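A sketch of how the three `RetryDelayMode` curves differ, assuming a 1-based attempt counter and a base delay in milliseconds (the engine's exact formula may vary):

```typescript
type RetryDelayMode = 'Fixed' | 'Linear' | 'Exponential';

// With baseDelayMS = 1000: Fixed → 1000, 1000, 1000...; Linear → 1000, 2000, 3000...;
// Exponential → 1000, 2000, 4000...
function retryDelayMS(mode: RetryDelayMode, baseDelayMS: number, attempt: number): number {
  switch (mode) {
    case 'Fixed': return baseDelayMS;
    case 'Linear': return baseDelayMS * attempt;
    case 'Exponential': return baseDelayMS * 2 ** (attempt - 1);
  }
}
```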
Error Handling:
// SelectionStrategy='Specific' with no candidates throws error
if (strategy === 'Specific' && candidates.length === 0) {
throw new Error('Please configure AIPromptModel records for this prompt');
}

🎯 Dynamic Hierarchical Template Composition
Why Dynamic Template Composition?
While MemberJunction's template system already supports static template composition (where Template A always includes Templates B and C), the AI Prompts system adds dynamic template composition - the ability to inject ANY prompt template into ANY other prompt template at runtime.
Static Composition (MJ Templates): Perfect for fixed relationships like email headers/footers
<!-- Email template always includes same header -->
{% include 'email-header' %}
{{ content }}
{% include 'email-footer' %}

Dynamic Composition (AI Prompts): Essential for flexible runtime relationships
// Inject ANY child prompt into ANY parent prompt at runtime
const params = new AIPromptParams();
params.prompt = systemPrompt; // e.g., Agent Type's control flow prompt
params.childPrompts = [
new ChildPromptParam(agentPrompt, 'agentInstructions') // Specific agent's prompt
];
// System prompt can use {{ agentInstructions }} to embed the agent's specific logic

The Agent System Use Case
This dynamic composition is crucial for AI Agents:
- Agent Types have System Prompts that control execution flow and response format
- Individual Agents have their own specific prompts with domain logic
- At runtime, any agent's prompt is dynamically injected into its type's system prompt
- This creates a complete prompt combining the control wrapper with agent-specific instructions
// Agent Type System Prompt (controls flow)
const systemPrompt = {
templateText: `You are an AI agent. Follow these instructions:
{{ agentInstructions }} <!-- Dynamically injected at runtime -->
Respond in JSON format with: { decision: ..., reasoning: ... }`
};
// Individual Agent Prompt (domain logic)
const dataGatherAgent = {
templateText: `Your role is to gather data from: {{ dataSources }}`
};
// At runtime, compose them dynamically
params.childPrompts = [
new ChildPromptParam(dataGatherAgent, 'agentInstructions')
];

🔄 System Placeholders
Automatically inject common values into all templates without manual data passing. Includes date/time, user context, prompt metadata, and more.
Current user: {{ _USER_NAME }}
Date: {{ _CURRENT_DATE }}
Expected output: {{ _OUTPUT_EXAMPLE }}

System Placeholders Reference
System placeholders are automatically available in all AI prompt templates, providing dynamic values like current date/time, prompt metadata, and user context without requiring manual data passing.
Available System Placeholders
Date/Time Placeholders
- `{{ _CURRENT_DATE }}` - Current date in YYYY-MM-DD format
- `{{ _CURRENT_TIME }}` - Current time in HH:MM AM/PM format with timezone
- `{{ _CURRENT_DATE_AND_TIME }}` - Full timestamp with date and time
- `{{ _CURRENT_DAY_OF_WEEK }}` - Current day name (e.g., Monday, Tuesday)
- `{{ _CURRENT_TIMEZONE }}` - Current timezone identifier
- `{{ _CURRENT_TIMESTAMP_UTC }}` - Current UTC timestamp in ISO format
Prompt Metadata Placeholders
- `{{ _OUTPUT_EXAMPLE }}` - The expected output example from the prompt configuration
- `{{ _PROMPT_NAME }}` - The name of the current prompt
- `{{ _PROMPT_DESCRIPTION }}` - The description of the current prompt
- `{{ _EXPECTED_OUTPUT_TYPE }}` - The expected output type (string, object, number, etc.)
- `{{ _RESPONSE_FORMAT }}` - The expected response format from the prompt
User Context Placeholders
- `{{ _USER_NAME }}` - Current user's full name
- `{{ _USER_EMAIL }}` - Current user's email address
- `{{ _USER_ID }}` - Current user's unique identifier
Environment Placeholders
- `{{ _ENVIRONMENT }}` - Current environment (development, staging, production)
- `{{ _API_VERSION }}` - Current API version
System Placeholder Usage Examples
Example 1: Time-Aware Agent Prompt
You are an AI assistant helping {{ _USER_NAME }} on {{ _CURRENT_DAY_OF_WEEK }}, {{ _CURRENT_DATE }} at {{ _CURRENT_TIME }}.
User's request: {{ userRequest }}
Please provide a helpful response considering the current time and day.

Example 2: Agent Type System Prompt with Metadata
# Agent Type: Loop Decision Maker
Current execution context:
- Date/Time: {{ _CURRENT_DATE_AND_TIME }}
- User: {{ _USER_NAME }} ({{ _USER_EMAIL }})
- Environment: {{ _ENVIRONMENT }}
## Expected Output Format
{{ _OUTPUT_EXAMPLE }}
## Agent Specific Instructions
{{ agentResponse }}
Based on the above agent response and the expected output format ({{ _EXPECTED_OUTPUT_TYPE }}), determine the next step.

Example 3: Debug-Friendly Prompt
[Debug Info]
- Prompt: {{ _PROMPT_NAME }}
- Description: {{ _PROMPT_DESCRIPTION }}
- Expected Output: {{ _EXPECTED_OUTPUT_TYPE }}
- User ID: {{ _USER_ID }}
- Timestamp: {{ _CURRENT_TIMESTAMP_UTC }}
[Task]
{{ taskDescription }}

Adding Custom System Placeholders
You can add custom system placeholders programmatically:
import { SystemPlaceholderManager } from '@memberjunction/ai-prompts';
// Add a custom placeholder
SystemPlaceholderManager.addPlaceholder({
name: '_ORGANIZATION_NAME',
description: 'Current organization name',
getValue: async (params) => {
// Custom logic to get organization name
return params.contextUser?.OrganizationName || 'Default Organization';
}
});
// Or add directly to the array
const placeholders = SystemPlaceholderManager.getPlaceholders();
placeholders.push({
name: '_CUSTOM_VALUE',
description: 'My custom value',
getValue: async (params) => 'custom result'
});

Data Merge Priority Order
When rendering templates, data is merged in this priority order (highest to lowest):
1. Template-specific data (`templateData` parameter)
2. Child template renders (for hierarchical template composition)
3. User-provided data (`data` parameter)
4. System placeholders (lowest priority)
This means users can override system placeholders by providing their own values with the same names.
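Conceptually, the merge behaves like object spreads where later sources win (variable names here are illustrative):

```typescript
const systemPlaceholders = { _USER_NAME: 'Pat Smith', topic: 'default' };
const userData = { _USER_NAME: 'Data Team', topic: 'churn' };   // the `data` parameter
const childRenders = { agentInstructions: '...rendered child...' };
const templateData = { topic: 'retention' };                    // the `templateData` parameter

const renderContext = {
  ...systemPlaceholders, // lowest priority
  ...userData,
  ...childRenders,
  ...templateData        // highest priority
};
// renderContext.topic === 'retention'; renderContext._USER_NAME === 'Data Team'
```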
⚡ Parallel Processing
Multi-model execution with intelligent result selection strategies and AI judge ranking for optimal results.
✅ Output Validation
JSON schema validation against OutputExample with intelligent retry logic and configurable validation behaviors.
🚫 Cancellation Support
AbortSignal integration for graceful execution cancellation with proper cleanup and partial result preservation.
📈 Progress & Streaming
Real-time progress callbacks and streaming response support for responsive user interfaces.
📊 Comprehensive Tracking
Hierarchical execution logging with the AIPromptRun entity, including token usage, timing, and validation attempts.
🤖 Agent Integration
Seamless integration with AI Agents through hierarchical prompts and execution tracking.
💾 Intelligent Caching
Vector similarity matching and TTL-based result caching for performance optimization.
🔧 Template Integration
Dynamic prompt generation with MemberJunction template system supporting conditionals, loops, and data injection.
Installation
npm install @memberjunction/ai-prompts

Note: This package uses MemberJunction's class registration system. The package automatically registers its classes on import to ensure proper functionality within the MJ ecosystem.
Type Organization Update (2025)
As part of improving code organization:
- This package now imports base AI types from `@memberjunction/ai` (Core)
- Prompt-specific types remain in this package: `AIPromptParams`, `AIPromptRunResult`, `ChildPromptParam`, `SystemPlaceholder`, and execution callback/progress types
- Agent integration types are imported from `@memberjunction/ai-agents` when needed
Requirements
- Node.js 16+
- MemberJunction Core libraries
- @memberjunction/ai for base AI types and result structures
- @memberjunction/aiengine for model management and basic AI operations
- @memberjunction/templates for template rendering
Core Architecture
Dynamic vs Static Template Composition
The AI Prompts system introduces dynamic template composition that extends beyond MemberJunction's built-in static template features:
Static Template Composition (MJ Templates)
MemberJunction's template system supports embedding templates within templates through {% include %} directives. This is perfect for fixed relationships:
- Email templates with standard headers/footers
- Report templates with consistent formatting sections
- Any scenario where Template A always includes Templates B and C
Dynamic Template Composition (AI Prompts)
The AI Prompts system adds runtime template composition where relationships are determined dynamically:
- Runtime Flexibility: Inject ANY prompt template into ANY other prompt template
- Context-Aware: Choose which child templates to inject based on runtime conditions
- Agent Architecture: Combine system prompts (control flow) with agent prompts (domain logic)
- Modular Design: Build complex prompts from reusable components selected at runtime
Key Difference: While MJ Templates handle "Template A always includes B", AI Prompts handle "Template A includes X, where X is determined at runtime"
AIPromptRunner Class
The AIPromptRunner class is the central component for executing prompts with advanced features:
import { AIPromptRunner, AIPromptParams } from '@memberjunction/ai-prompts';
// Get a prompt from the system
const prompts = AIEngine.Instance.Prompts;
const summaryPrompt = prompts.find(p => p.Name === 'Document Summarization');
// Execute the prompt
const params: AIPromptParams = {
prompt: summaryPrompt,
data: {
documentText: "Long document content here...",
targetLength: "2 paragraphs"
},
contextUser: currentUser
};
const runner = new AIPromptRunner();
const result = await runner.ExecutePrompt(params);
if (result.success) {
console.log("Summary:", result.result);
console.log(`Execution time: ${result.executionTimeMS}ms`);
console.log(`Prompt tokens: ${result.promptTokens}`);
console.log(`Completion tokens: ${result.completionTokens}`);
console.log(`Total tokens: ${result.tokensUsed}`);
if (result.cost) {
console.log(`Cost: ${result.cost} ${result.costCurrency || 'USD'}`);
}
} else {
console.error("Error:", result.errorMessage);
}

Quick Start
1. Basic Prompt Execution
import { AIPromptRunner } from '@memberjunction/ai-prompts';
import { AIEngine } from '@memberjunction/aiengine';
// Initialize the AI Engine
await AIEngine.Instance.Config(false, currentUser);
// Find a prompt
const prompt = AIEngine.Instance.Prompts.find(p => p.Name === 'Text Analysis');
// Execute with data
const runner = new AIPromptRunner();
const result = await runner.ExecutePrompt({
prompt: prompt,
data: {
text: "Analyze this sample text for sentiment and key themes.",
format: "bullet points"
},
contextUser: currentUser
});
console.log("Analysis:", result.result);2. Template-Driven Prompts
// Prompt templates support dynamic data substitution
const templatePrompt = {
UserMessage: `Analyze the {{entity.EntityType}} record for {{entity.Name}}.
Focus on {{analysisType}} and provide insights about {{entity.Description}}.`
};
// Data context provides template variables
const result = await runner.ExecutePrompt({
prompt: templatePrompt,
data: {
entity: {
EntityType: "Customer",
Name: "Acme Corp",
Description: "Enterprise software company"
},
analysisType: "growth opportunities"
},
contextUser: currentUser
});

3. Parallel Execution with Multiple Models
// Execute the same prompt across multiple models in parallel
const multiModelPrompt = prompts.find(p => p.ParallelizationMode === 'ModelSpecific');
const result = await runner.ExecutePrompt({
prompt: multiModelPrompt,
data: { query: "Analyze this data pattern" },
contextUser: currentUser
});
// When using parallel execution, the system automatically selects the best result
console.log(`Final result: ${result.result}`);
console.log(`Execution time: ${result.executionTimeMS}ms`);
console.log(`Total tokens used: ${result.tokensUsed}`);
// The promptRun entity contains metadata about parallel execution in its Messages field
if (result.promptRun?.Messages) {
const metadata = JSON.parse(result.promptRun.Messages);
if (metadata.parallelExecution) {
console.log(`Parallelization mode: ${metadata.parallelExecution.parallelizationMode}`);
console.log(`Total tasks: ${metadata.parallelExecution.totalTasks}`);
console.log(`Successful tasks: ${metadata.parallelExecution.successfulTasks}`);
}
}

4. Dynamic Template Composition for AI Agents
This example demonstrates the primary use case for dynamic template composition - the AI Agent system:
import { AIPromptRunner, ChildPromptParam } from '@memberjunction/ai-prompts';
// Agent Type System Prompt - Controls execution flow and response format
const agentTypeSystemPrompt = {
Name: "Data Analysis Agent Type System Prompt",
TemplateID: "system-prompt-template-id",
// Template contains: "You are an AI agent. {{ agentInstructions }} Respond with JSON..."
};
// Individual Agent Prompt - Contains domain-specific logic
const specificAgentPrompt = {
Name: "Customer Churn Analysis Agent",
TemplateID: "churn-agent-template-id",
// Template contains: "Analyze customer data for churn risk factors..."
};
// At runtime, dynamically compose the prompts
const runner = new AIPromptRunner();
const result = await runner.ExecutePrompt({
prompt: agentTypeSystemPrompt, // Parent template
childPrompts: [
// Dynamically inject the specific agent's instructions
new ChildPromptParam(specificAgentPrompt, 'agentInstructions')
],
data: {
customerData: analysisData,
thresholds: { churnRisk: 0.7 }
},
contextUser: currentUser
});
// The system executed ONE prompt that combined:
// 1. System prompt wrapper (control flow)
// 2. Specific agent instructions (domain logic)
// 3. Runtime data
console.log("Agent decision:", result.result);Why This Matters:
- Different agents can use the SAME system prompt template
- System prompt enforces consistent response format across all agents
- Agent-specific logic is cleanly separated and reusable
- Runtime composition allows flexible agent architectures
5. Complete Example with All New Features
import { AIPromptRunner } from '@memberjunction/ai-prompts';
import { AIEngine } from '@memberjunction/aiengine';
// Complete example showcasing all Phase 6 enhancements
async function comprehensivePromptExecution() {
// Initialize
await AIEngine.Instance.Config(false, currentUser);
const runner = new AIPromptRunner();
// Set up cancellation (e.g., from user clicking cancel button)
const controller = new AbortController();
const timeoutId = setTimeout(() => {
controller.abort();
console.log('Operation timed out after 2 minutes');
}, 120000);
try {
const result = await runner.ExecutePrompt({
prompt: complexAnalysisPrompt, // ParallelizationMode: 'ModelSpecific'
data: {
document: largeDocument,
analysisType: 'comprehensive',
outputFormat: 'structured'
},
contextUser: currentUser,
// Enable cancellation
cancellationToken: controller.signal,
// Track progress throughout execution
onProgress: (progress) => {
console.log(`[${progress.step}] ${progress.percentage}% - ${progress.message}`);
// Handle parallel execution progress
if (progress.metadata?.parallelExecution) {
const parallel = progress.metadata.parallelExecution;
console.log(` → Group ${parallel.currentGroup + 1}/${parallel.totalGroups}, Tasks: ${parallel.completedTasks}/${parallel.totalTasks}`);
}
// Update UI
updateProgressBar(progress.percentage);
updateStatusText(progress.message);
},
// Receive streaming content updates
onStreaming: (chunk) => {
if (chunk.isComplete) {
console.log(`Streaming complete for ${chunk.modelName}`);
finalizeOutput();
} else {
// Show real-time content generation
console.log(`[${chunk.modelName}]: ${chunk.content.substring(0, 50)}...`);
appendToDisplay(chunk.content, chunk.taskId);
}
}
});
// Clear timeout since we completed successfully
clearTimeout(timeoutId);
// Handle different result scenarios
if (result.cancelled) {
console.log(`Execution cancelled: ${result.cancellationReason}`);
// May still have partial results available
if (result.additionalResults && result.additionalResults.length > 0) {
console.log(`${result.additionalResults.length} partial results available`);
}
} else if (result.success) {
console.log('Execution completed successfully!');
console.log(`Primary result from ${result.modelInfo?.modelName}: ${result.result}`);
// Analyze judge selection if multiple results
if (result.ranking && result.judgeRationale) {
console.log(`Selected as #${result.ranking} by AI judge: ${result.judgeRationale}`);
}
// Review alternative results from parallel execution
if (result.additionalResults) {
console.log(`${result.additionalResults.length} alternative results ranked by judge:`);
result.additionalResults.forEach((altResult, index) => {
console.log(` ${altResult.ranking}. ${altResult.modelInfo?.modelName}: ${altResult.judgeRationale}`);
});
}
// Analyze execution performance using hierarchical logging
if (result.promptRun?.RunType === 'ParallelParent') {
await analyzeParallelExecutionPerformance(result.promptRun.ID);
}
// Check streaming and caching
if (result.wasStreamed) {
console.log('Response was streamed in real-time');
}
if (result.cacheInfo?.cacheHit) {
console.log(`Result served from cache: ${result.cacheInfo.cacheSource}`);
}
} else {
console.error(`Execution failed: ${result.errorMessage}`);
}
} catch (error) {
clearTimeout(timeoutId);
console.error('Execution error:', error.message);
}
}
// Helper function to analyze parallel execution performance
async function analyzeParallelExecutionPerformance(parentPromptRunId: string) {
// Query hierarchical logs to understand execution breakdown
console.log('Analyzing parallel execution performance...');
// This would typically be a database query or API call
// For demonstration, showing the concept:
const analysisQuery = `
SELECT
pr.RunType,
pr.ExecutionOrder,
pr.Success,
pr.ExecutionTimeMS,
pr.TokensUsed,
m.Name as ModelName
FROM AIPromptRun pr
JOIN AIModel m ON pr.ModelID = m.ID
WHERE pr.ParentID = '${parentPromptRunId}' OR pr.ID = '${parentPromptRunId}'
ORDER BY pr.RunType, pr.ExecutionOrder
`;
console.log('Performance analysis query:', analysisQuery);
// Execute query and analyze results...
}
// Execute the comprehensive example
comprehensivePromptExecution().catch(console.error);

Advanced Features
Intelligent Failover System
The AI Prompt Runner includes a sophisticated failover system that automatically handles provider outages, rate limits, and service degradation. This ensures your AI-powered applications remain resilient and responsive even when individual providers experience issues.
How Failover Works
When a prompt execution fails, the system:
- Analyzes the error using the ErrorAnalyzer to determine if failover is appropriate
- Selects alternative models/vendors based on the configured strategy
- Applies intelligent delays with exponential backoff to prevent overwhelming providers
- Tracks all attempts for debugging and analysis
- Updates the execution to use the successful model/vendor combination
Failover Configuration
Configure failover behavior at the prompt level:
// Database columns added to AIPrompt entity:
FailoverStrategy: 'SameModelDifferentVendor' | 'NextBestModel' | 'PowerRank' | 'None'
FailoverMaxAttempts: number // Maximum failover attempts (default: 3)
FailoverDelaySeconds: number // Initial delay between attempts (default: 1)
FailoverModelStrategy: 'PreferSameModel' | 'PreferDifferentModel' | 'RequireSameModel'
FailoverErrorScope: 'All' | 'NetworkOnly' | 'RateLimitOnly' | 'ServiceErrorOnly'

Failover Strategies Explained
SameModelDifferentVendor: Ideal for multi-cloud deployments
// Example: Claude from different providers
// Primary: Anthropic API
// Failover 1: AWS Bedrock
// Failover 2: Google Vertex AI

NextBestModel: Balances capability and availability
// Example: Gradual capability reduction
// Primary: GPT-4-turbo
// Failover 1: Claude-3-opus
// Failover 2: GPT-3.5-turbo

PowerRank: Uses MemberJunction's model power rankings
// Automatically selects models based on their PowerRank scores
// Ensures you always get the best available model

Error Scope Configuration
Control which types of errors trigger failover:
- All: Any error triggers failover (most resilient)
- NetworkOnly: Only network/connection errors
- RateLimitOnly: Only rate limit errors (429 status)
- ServiceErrorOnly: Only service errors (500, 503 status)
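A simplified sketch of how an error scope might gate failover, using the statuses listed above (the real ErrorAnalyzer performs richer classification):

```typescript
type FailoverErrorScope = 'All' | 'NetworkOnly' | 'RateLimitOnly' | 'ServiceErrorOnly';

// Returns true when the given error should trigger a failover attempt.
function errorTriggersFailover(scope: FailoverErrorScope, httpStatus?: number, isNetworkError = false): boolean {
  switch (scope) {
    case 'All': return true;
    case 'NetworkOnly': return isNetworkError;
    case 'RateLimitOnly': return httpStatus === 429;
    case 'ServiceErrorOnly': return httpStatus === 500 || httpStatus === 503;
  }
}
```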
Failover Tracking
The system comprehensively tracks failover attempts in the database:
// AIPromptRun entity tracking fields:
OriginalModelID: string // The initially selected model
OriginalRequestStartTime: Date // When the request started
FailoverAttempts: number // Number of failover attempts made
FailoverErrors: string (JSON) // Detailed error information for each attempt
FailoverDurations: string (JSON) // Duration of each attempt in milliseconds
TotalFailoverDuration: number // Total time spent in failover

Advanced Failover Customization
The AIPromptRunner exposes protected methods for advanced customization:
class CustomPromptRunner extends AIPromptRunner {
// Override to implement custom failover configuration
protected getFailoverConfiguration(prompt: AIPromptEntity): FailoverConfiguration {
// Add environment-specific logic
if (process.env.NODE_ENV === 'production') {
return {
strategy: 'SameModelDifferentVendor',
maxAttempts: 5,
delaySeconds: 2,
modelStrategy: 'PreferSameModel',
errorScope: 'NetworkOnly'
};
}
return super.getFailoverConfiguration(prompt);
}
// Override to implement custom failover decision logic
protected shouldAttemptFailover(
error: Error,
config: FailoverConfiguration,
attemptNumber: number
): boolean {
// Add custom error analysis
if (error.message.includes('quota_exceeded')) {
return false; // Don't retry quota errors
}
return super.shouldAttemptFailover(error, config, attemptNumber);
}
// Override to implement custom delay calculation
protected calculateFailoverDelay(
attemptNumber: number,
baseDelaySeconds: number,
previousError?: Error
): number {
// Custom backoff strategy
if (previousError?.message.includes('rate_limit')) {
return 60000; // 1 minute for rate limits
}
return super.calculateFailoverDelay(attemptNumber, baseDelaySeconds, previousError);
}
}

Failover Best Practices
- Configure Appropriately: Use `NetworkOnly` or `RateLimitOnly` for production to avoid retrying invalid requests
- Set Reasonable Attempts: 3-5 attempts are typically sufficient
- Monitor Failover Patterns: Query the tracking data to identify problematic providers
- Test Failover Scenarios: Simulate provider outages in development
- Consider Costs: Failover may route to more expensive providers
Example: Production-Ready Configuration
const productionPrompt = {
Name: "Customer Service Assistant",
FailoverStrategy: "SameModelDifferentVendor",
FailoverMaxAttempts: 4,
FailoverDelaySeconds: 2,
FailoverModelStrategy: "PreferSameModel",
FailoverErrorScope: "NetworkOnly",
// Ensure failover stays within approved models
MinPowerRank: 85
};
// Query failover performance
const failoverStats = await runView.RunView({
EntityName: 'MJ: AI Prompt Runs',
ExtraFilter: `FailoverAttempts > 0 AND RunAt >= '2024-01-01'`,
OrderBy: 'RunAt DESC'
});
// Analyze which vendors are most reliable
SELECT
OriginalModelID,
ModelID as FinalModelID,
COUNT(*) as FailoverCount,
AVG(TotalFailoverDuration) as AvgFailoverTime
FROM AIPromptRun
WHERE FailoverAttempts > 0
GROUP BY OriginalModelID, ModelID
ORDER BY FailoverCount DESC;

Configuration-Aware Failover
The failover system respects AIConfiguration boundaries to ensure environment-specific models stay isolated:
How It Works:
- When you specify a `configurationId`, the system builds a candidate list with two priority tiers:
  - Configuration-specific models (priority 5000+): Models assigned to your configuration
  - NULL configuration models (priority 2000+): Universal fallback models available to all configurations
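The priority bands imply an effective priority along these lines (a sketch consistent with the example below, where Priority 100 in the Production config surfaces as 5100):

```typescript
// Config-specific candidates land in the 5000+ band, NULL-config fallbacks
// in the 2000+ band, with AIPromptModel.Priority added within each band.
function effectivePriority(matchesConfiguration: boolean, priority: number): number {
  return (matchesConfiguration ? 5000 : 2000) + priority;
}
// effectivePriority(true, 100) === 5100  // config-specific model
// effectivePriority(false, 90) === 2090  // universal fallback
```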
Example Setup:
-- Production Configuration: Only approved production models
INSERT INTO AIPromptModel (PromptID, ModelID, ConfigurationID, Priority)
VALUES
(@PromptID, @Claude35SonnetID, @ProductionConfigID, 100),
(@PromptID, @GPT4ID, @ProductionConfigID, 90);
-- Development Configuration: Include experimental models
INSERT INTO AIPromptModel (PromptID, ModelID, ConfigurationID, Priority)
VALUES
(@PromptID, @LlamaExperimentalID, @DevelopmentConfigID, 100);
-- NULL Configuration: Universal fallbacks for all environments
INSERT INTO AIPromptModel (PromptID, ModelID, ConfigurationID, Priority)
VALUES
(@PromptID, @Claude3HaikuID, NULL, 100),
(@PromptID, @GPT35TurboID, NULL, 90);

Failover Behavior:
// Execute with Production configuration
const result = await runner.ExecutePrompt({
prompt: myPrompt,
configurationId: productionConfigID,
data: { query: 'Analyze this' }
});
// Failover order:
// 1. Try Claude 3.5 Sonnet (Production config, priority 5100)
// 2. Try GPT-4 (Production config, priority 5090)
// 3. Try Claude 3 Haiku (NULL config fallback, priority 2100)
// 4. Try GPT-3.5 Turbo (NULL config fallback, priority 2090)
// ✅ Never crosses to Development config models

Key Benefits:
- Environment Isolation: Production models never failover to development/experimental models
- Controlled Fallback: Explicit hierarchy from config-specific to universal fallbacks
- Performance: Candidate list built once and cached, no rebuilding during failover
- Consistency: Same candidate list used for initial selection and all failover attempts
Intelligent Caching
The prompt system provides sophisticated caching with vector similarity matching:
// Caching is automatically handled based on prompt configuration:
// - EnableCaching: Whether to use caching for this prompt
// - CacheMatchType: 'Exact' or 'Vector' similarity matching
// - CacheTTLSeconds: Time-to-live for cached results
// - CacheMustMatchModel/Vendor/Agent: Cache constraint options
// Vector similarity allows reusing results for semantically similar prompts
// even if the exact text differs
const cachedPrompt = {
Name: "Smart Summary",
EnableCaching: true,
CacheMatchType: "Vector",
CacheTTLSeconds: 3600,
CacheSimilarityThreshold: 0.85,
CacheMustMatchModel: true,
CacheMustMatchVendor: false
};

Parallel Execution Strategies
The system supports multiple parallelization modes:
// Prompts can be configured for parallel execution:
// - ParallelizationMode: 'None', 'StaticCount', 'ConfigParam', 'ModelSpecific'
// - ParallelCount: Number of parallel executions
// - ExecutionGroups: Sequential group execution with parallel tasks within groups
// Example configurations:
// Static parallel count
const staticParallelPrompt = {
ParallelizationMode: "StaticCount",
ParallelCount: 3
};
// Configuration-driven count
const configParallelPrompt = {
ParallelizationMode: "ConfigParam",
ParallelConfigParam: "analysis_parallel_count"
};
// Model-specific configuration
const modelSpecificPrompt = {
ParallelizationMode: "ModelSpecific",
// Uses settings from AIPromptModel entries
};

Result Selection Strategies
// The engine supports multiple result selection methods:
// - 'First': Use the first successful result
// - 'Random': Randomly select from successful results
// - 'PromptSelector': Use AI to select the best result
// - 'Consensus': Select result with highest agreement
// Result selector prompts can be configured to intelligently choose
// the best result from parallel executions
const selectorPrompt = {
Name: "Best Result Selector",
PromptText: `
You are evaluating multiple AI responses to select the best one.
Original query: {{originalQuery}}
Responses:
{{#each responses}}
Response {{@index}}: {{this}}
{{/each}}
Select the response number (0-based) that is most accurate, helpful, and well-written.
Return only the number.
`,
OutputType: "number"
};
const mainPrompt = {
ParallelizationMode: "StaticCount",
ParallelCount: 3,
ResultSelectorPromptID: selectorPrompt.ID
};

Output Validation
// Configure structured output validation
const validatedPrompt = {
Name: "Structured Analysis",
OutputType: "object",
OutputExample: {
sentiment: "positive|negative|neutral",
confidence: 0.95,
keyThemes: ["theme1", "theme2"],
summary: "Brief summary text"
},
ValidationBehavior: "Strict",
MaxRetries: 3,
RetryDelayMS: 1000,
RetryStrategy: "exponential"
};
// Validation is automatically applied
const result = await runner.ExecutePrompt({
prompt: validatedPrompt,
data: { text: "Content to analyze" },
contextUser: currentUser,
skipValidation: false // Validation enabled
});
// Result.result will be validated against the expected structure

Validation Syntax Cleaning
When using output validation with JSON responses, the AI Prompt Runner automatically handles validation syntax that AI models might inadvertently include in their JSON keys:
// Validation syntax in prompts:
// - name?: optional field
// - items:[2+]: array with minimum 2 items
// - status:!empty: non-empty required field
// - count:number: field with type hint
// If the AI returns JSON with validation syntax in keys:
{
"name?": "John Doe",
"items:[2+]": ["apple", "banana", "orange"],
"status:!empty": "active",
"count:number": 42
}
// The system automatically cleans it to:
{
"name": "John Doe",
"items": ["apple", "banana", "orange"],
"status": "active",
"count": 42
}

Automatic Cleaning Behavior:
- Always enabled when the prompt has `ValidationBehavior` set to `"Strict"` or `"Warn"`
- Always enabled when the prompt has an `OutputExample` defined
- Optional for prompts with `ValidationBehavior` set to `"None"` (via the `cleanValidationSyntax` parameter)
This ensures that validation patterns used in prompt templates don't interfere with the actual JSON structure returned by the AI model.
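A minimal sketch of that cleaning step (illustrative only; the package's internal cleaner may handle more cases):

```typescript
// Strips trailing validation markers (e.g. `?`, `:[2+]`, `:!empty`, `:number`)
// from JSON keys, leaving values untouched.
function cleanValidationKeys(obj: Record<string, unknown>): Record<string, unknown> {
  const cleaned: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    cleaned[key.replace(/\?$/, '').replace(/:.*$/, '')] = value;
  }
  return cleaned;
}

// cleanValidationKeys({ 'name?': 'John', 'items:[2+]': ['a', 'b'] })
// → { name: 'John', items: ['a', 'b'] }
```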
### Template Integration
Advanced template features with the MemberJunction template system:
```typescript
// Complex template with conditionals and loops
const advancedTemplate = {
PromptText: `
Analyze the following {{entityType}} records:
{{#each records}}
{{@index + 1}}. {{this.Name}}
Status: {{this.Status}}
{{#if this.Priority}}Priority: {{this.Priority}}{{/if}}
{{#each this.Tags}}
- Tag: {{this}}
{{/each}}
{{/each}}
{{#if includeRecommendations}}
Please provide recommendations for improvement.
{{/if}}
Focus on: {{analysisAreas.join(", ")}}
`
};
const result = await runner.ExecutePrompt({
prompt: advancedTemplate,
data: {
entityType: "Customer",
records: customerData,
includeRecommendations: true,
analysisAreas: ["revenue potential", "risk factors", "engagement"]
},
contextUser: currentUser
});
```

Parallel Execution System
The package includes sophisticated parallel execution capabilities through specialized classes that work together to manage complex multi-model executions.
Note: The ExecutionPlanner and ParallelExecutionCoordinator are internal components used by AIPromptRunner. They are not directly exposed in the public API but understanding their operation helps in configuring prompts effectively.
ExecutionPlanner (Internal)
The ExecutionPlanner class analyzes prompt configuration and creates optimal execution strategies:
Key Responsibilities:
- Analyzes parallelization modes (None, StaticCount, ConfigParam, ModelSpecific)
- Creates execution groups for coordinated processing
- Determines optimal task distribution based on model availability
- Assigns priorities and manages execution order
- Handles model selection based on power rankings and configuration
Execution Plan Creation:
- For `StaticCount`: Creates N parallel tasks using available models
- For `ConfigParam`: Uses configuration parameters to determine the parallel count
- For `ModelSpecific`: Uses AIPromptModel entries to define exact model usage
- Supports execution groups for sequential/parallel hybrid execution
ParallelExecutionCoordinator (Internal)
The ParallelExecutionCoordinator orchestrates the actual execution of tasks created by the ExecutionPlanner:
Core Features:
- Manages concurrency limits (default: 5 concurrent executions)
- Implements retry logic with exponential backoff
- Handles partial result collection when some tasks fail
- Provides comprehensive execution metrics and timing
- Supports fail-fast mode for critical operations
Execution Flow:
- Groups tasks by execution group number
- Executes groups sequentially (group 0, then 1, then 2, etc.)
- Within each group, executes tasks in parallel up to concurrency limit
- Collects and aggregates results from all executions
- Applies result selection strategy if multiple results available
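The flow can be pictured with a small sketch (names are illustrative, not the package's internal API):

```typescript
interface ParallelTask { executionGroup: number; run: () => Promise<unknown>; }

// Groups run sequentially; tasks within a group run concurrently in
// windows of `concurrency` (default 5, matching the coordinator's default).
async function runGroups(tasks: ParallelTask[], concurrency = 5): Promise<unknown[]> {
  const groups = [...new Set(tasks.map(t => t.executionGroup))].sort((a, b) => a - b);
  const results: unknown[] = [];
  for (const group of groups) {
    const groupTasks = tasks.filter(t => t.executionGroup === group);
    for (let i = 0; i < groupTasks.length; i += concurrency) {
      const window = groupTasks.slice(i, i + concurrency);
      results.push(...(await Promise.all(window.map(t => t.run()))));
    }
  }
  return results;
}
```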
Supported Parallelization Modes
- None: Traditional single execution
- StaticCount: Fixed number of parallel executions
- ConfigParam: Dynamic parallel count from configuration
- ModelSpecific: Individual model configurations with execution groups
// Example of model-specific parallel configuration
const modelSpecificExecution = {
prompt: complexPrompt,
data: analysisData,
contextUser: currentUser
};
// The system will:
// 1. Query AIPromptModel entries for this prompt
// 2. Group executions by ExecutionGroup
// 3. Execute groups sequentially, models within groups in parallel
// 4. Apply result selection strategy
const result = await runner.ExecutePrompt(modelSpecificExecution);

Performance Monitoring & Analytics
Comprehensive tracking and analytics for prompt executions:
// Execution results include detailed metrics
const result = await runner.ExecutePrompt(params);
console.log(`Execution time: ${result.executionTimeMS}ms`);
console.log(`Tokens used: ${result.tokensUsed}`);
// The AIPromptRunResult includes execution tracking
if (result.promptRun) {
console.log(`Prompt Run ID: ${result.promptRun.ID}`);
console.log(`Model used: ${result.promptRun.ModelID}`);
console.log(`Configuration: ${result.promptRun.ConfigurationID}`);
}

Early Run ID Callback
Get the PromptRun ID immediately after creation for real-time monitoring:
const params = new AIPromptParams();
params.prompt = myPrompt;
params.data = { query: 'Analyze this data' };
// Callback fired immediately after PromptRun record is saved
params.onPromptRunCreated = async (promptRunId) => {
console.log(`Prompt run started: ${promptRunId}`);
// Use cases:
// - Link to parent records (e.g., AIAgentRunStep.TargetLogID)
// - Send to monitoring systems
// - Update UI with tracking info
// - Start real-time log streaming
};
const result = await runner.ExecutePrompt(params);

The callback is invoked:
- When: Right after the AIPromptRun record is created and saved
- Before: The actual AI model execution begins
- Error Handling: Callback errors are logged but don't fail the execution
- Async Support: Can be synchronous or asynchronous
AI Prompt Run Logging
The AI Prompt Runner implements a sophisticated hierarchical logging system that tracks all execution activities in the database through the AIPromptRun entity. This system provides complete traceability and analytics for both simple and complex parallel executions.
Hierarchical Logging Structure
The logging system uses a parent-child relationship model with different RunType values to represent the execution hierarchy:
- `Single`: Standard single-model execution
- `ParallelParent`: Parent record for a parallel execution coordinating multiple models
- `ParallelChild`: Individual model execution within a parallel run
- `ResultSelector`: AI judge execution that selects the best result from parallel executions
RunType Values and Relationships
// Single execution - no parent relationship
{
RunType: 'Single',
ParentID: null,
ExecutionOrder: null
}
// Parallel execution creates a hierarchical structure:
// 1. Parent record coordinates the overall execution
{
RunType: 'ParallelParent',
ParentID: null,
ExecutionOrder: null
}
// 2. Child records for each model execution
{
RunType: 'ParallelChild',
ParentID: '12345-parent-id',
ExecutionOrder: 0 // Order within execution group
}
// 3. Result selector judges the best result
{
RunType: 'ResultSelector',
ParentID: '12345-parent-id',
ExecutionOrder: 5 // After all parallel children
}

Database Schema Fields
Key fields in the AIPromptRun entity for hierarchical logging:
-- Core execution tracking
PromptID uniqueidentifier -- Prompt being executed
ModelID uniqueidentifier -- AI model used
VendorID uniqueidentifier -- Vendor providing the model
RunAt datetime2 -- Execution start time
CompletedAt datetime2 -- Execution completion time
-- Hierarchical logging fields
RunType nvarchar(50) -- 'Single', 'ParallelParent', 'ParallelChild', 'ResultSelector'
ParentID uniqueidentifier -- Parent prompt run ID (NULL for top-level)
ExecutionOrder int -- Order within parallel execution group
-- Results and metrics
Success bit -- Whether execution succeeded
Result nvarchar(max) -- Raw result from AI model
ErrorMessage nvarchar(500) -- Error message if failed
ExecutionTimeMS int -- Total execution time
TokensUsed int -- Total tokens consumed
TokensPrompt int -- Prompt tokens used
TokensCompletion int -- Completion tokens generated
-- Cost tracking
Cost decimal(19,8) -- Cost of this specific execution
CostCurrency nvarchar(10) -- ISO 4217 currency code (USD, EUR, etc.)
-- Hierarchical rollup fields (NEW)
TokensUsedRollup int -- Total tokens including all children
TokensPromptRollup int -- Total prompt tokens including all children
TokensCompletionRollup int -- Total completion tokens including all children
-- Note: TotalCost (existing field) serves as the cost rollup
-- Context and configuration
Messages nvarchar(max) -- JSON with input data and metadata
ConfigurationID uniqueidentifier -- Environment configuration used
AgentRunID uniqueidentifier -- Links to parent AIAgentRun if applicable

Hierarchical Token and Cost Tracking
The AI Prompts system implements a sophisticated rollup pattern for tracking token usage and costs across hierarchical prompt executions:
Prompt Execution Rollup Pattern
For hierarchical prompt executions (parent prompts with child prompts), each node in the tree contains:
- Direct fields (`TokensPrompt`, `TokensCompletion`, `Cost`): Usage for just that execution
- Rollup fields (`TokensPromptRollup`, `TokensCompletionRollup`, `TotalCost`): Totals including all descendants
Example:
Parent Prompt (100 prompt, 200 completion tokens, $0.05)
├── Child A (50 prompt, 100 completion, $0.02)
└── Child B (75 prompt, 150 completion, $0.03)
Database records:
- Parent: TokensPrompt=100, TokensPromptRollup=225 (100+50+75)
TokensCompletion=200, TokensCompletionRollup=450 (200+100+150)
Cost=0.05, TotalCost=0.10 (0.05+0.02+0.03)
- Child A: TokensPrompt=50, TokensPromptRollup=50 (leaf node)
Cost=0.02, TotalCost=0.02 (leaf node)
- Child B: TokensPrompt=75, TokensPromptRollup=75 (leaf node)
Cost=0.03, TotalCost=0.03 (leaf node)

This enables efficient queries like:
- "What was the total cost of this hierarchical prompt?" → Check root's
TotalCost - "How many tokens did this sub-prompt and its children use?" → Check that node's rollup fields
- No complex SQL joins or recursive CTEs needed!
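For example, fetching a root run's rollups becomes a single-row read (a sketch using the RunView pattern shown earlier; `runView` and `rootPromptRunId` are assumed to be in scope):

```typescript
// Assumes `runView` is a RunView instance as in the earlier monitoring example,
// and `rootPromptRunId` is the ID of the hierarchical execution's root run.
const rootRun = await runView.RunView({
  EntityName: 'MJ: AI Prompt Runs',
  ExtraFilter: `ID = '${rootPromptRunId}'`
});
const row = rootRun.Results[0];
console.log(`Total cost incl. children: ${row.TotalCost}`);
console.log(`Prompt tokens incl. children: ${row.TokensPromptRollup}`);
```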
Agent Run Token Tracking
The AIAgentRun entity tracks aggregate token usage across all prompt executions during an agent's lifecycle:
-- New fields in AIAgentRun
TotalTokensUsed int -- Total tokens (existing)
TotalPromptTokensUsed int -- Breakdown: prompt tokens (NEW)
TotalCompletionTokensUsed int -- Breakdown: completion tokens (NEW)
TotalCost decimal -- Total cost (existing)
-- Hierarchical agent rollup fields (NEW)
TotalTokensUsedRollup int -- Including sub-agent runs
TotalPromptTokensUsedRollup int -- Including sub-agent runs
TotalCompletionTokensUsedRollup int -- Including sub-agent runs
TotalCostRollup decimal -- Including sub-agent runs

Agent Hierarchy Example:
Parent Agent (A)
├── Own prompts: 200 prompt, 400 completion tokens
├── Sub-Agent (B)
│ └── Own prompts: 100 prompt, 200 completion tokens
└── Sub-Agent (C)
└── Own prompts: 150 prompt, 300 completion tokens
Rollup values:
- Agent A: TotalPromptTokensUsedRollup = 450 (200+100+150)
TotalCompletionTokensUsedRollup = 900 (400+200+300)
- Agent B: TotalPromptTokensUsedRollup = 100 (leaf agent)
- Agent C: TotalPromptTokensUsedRollup = 150 (leaf agent)

Querying Hierarchical Log Data
The hierarchical structure enables powerful analytics queries:
-- Get all executions for a parallel run
SELECT
pr.ID,
pr.RunType,
pr.ExecutionOrder,
pr.Success,
pr.ExecutionTimeMS,
pr.TokensUsed,
m.Name as ModelName,
p.Name as PromptName
FROM AIPromptRun pr
JOIN AIModel m ON pr.ModelID = m.ID
JOIN AIPrompt p ON pr.PromptID = p.ID
WHERE pr.ParentID = '12345-parent-id'
OR pr.ID = '12345-parent-id'
ORDER BY pr.RunType, pr.ExecutionOrder;
-- Analyze parallel execution performance
WITH ParallelStats AS (
SELECT
ParentID,
COUNT(*) as TotalChildren,
SUM(CASE WHEN Success = 1 THEN 1 ELSE 0 END) as SuccessfulChildren,
AVG(ExecutionTimeMS) as AvgExecutionTime,
SUM(TokensUsed) as TotalTokens
FROM AIPromptRun
WHERE RunType = 'ParallelChild'
AND ParentID IS NOT NULL
GROUP BY ParentID
)
SELECT
parent.ID as ParentRunID,
parent.RunAt,
parent.ExecutionTimeMS as ParentExecutionTime,
stats.TotalChildren,
stats.SuccessfulChildren,
stats.AvgExecutionTime,
stats.TotalTokens,
prompt.Name as PromptName
FROM AIPromptRun parent
JOIN ParallelStats stats ON parent.ID = stats.ParentID
JOIN AIPrompt prompt ON parent.PromptID = prompt.ID
WHERE parent.RunType = 'ParallelParent'
ORDER BY parent.RunAt DESC;
-- Find failed executions with context
SELECT
pr.ID,
pr.RunType,
pr.ParentID,
pr.ErrorMessage,
pr.ExecutionTimeMS,
m.Name as ModelName,
v.Name as VendorName,
p.Name as PromptName
FROM AIPromptRun pr
JOIN AIModel m ON pr.ModelID = m.ID
LEFT JOIN AIVendor v ON pr.VendorID = v.ID
JOIN AIPrompt p ON pr.PromptID = p.ID
WHERE pr.Success = 0
ORDER BY pr.RunAt DESC;

Cancellation Support
The AI Prompt Runner provides comprehensive cancellation support through the standard JavaScript AbortSignal and AbortController pattern, enabling graceful termination of long-running operations.
Understanding AbortSignal in Prompt Execution
The AbortSignal pattern separates cancellation control from cancellation handling:
- Your Code (Controller): Creates the `AbortController` and decides when to cancel
- Prompt Runner (Worker): Receives the `AbortSignal` token and handles how to cancel gracefully
This separation allows for flexible cancellation from multiple sources (user actions, timeouts, resource limits) while the Prompt Runner handles the complex cleanup across parallel executions, model calls, and result selection.
The Pattern Flow:
Controller (Your Code)  →  AbortController.signal  →  AIPromptRunner
        ↓                          ↓                        ↓
  Decides WHEN              The "Red Phone"            Handles HOW
   to cancel                     Token                    to stop

Basic Cancellation Usage
import { AIPromptRunner } from '@memberjunction/ai-prompts';
// Create cancellation controller
const controller = new AbortController();
const cancellationToken = controller.signal;
// Set up cancellation after 30 seconds
setTimeout(() => {
controller.abort();
console.log('Prompt execution cancelled due to timeout');
}, 30000);
// Execute prompt with cancellation support
const runner = new AIPromptRunner();
const result = await runner.ExecutePrompt({
prompt: myPrompt,
data: { query: 'Long running analysis...' },
contextUser: currentUser,
cancellationToken: cancellationToken
});
// Check if execution was cancelled
if (result.cancelled) {
console.log(`Execution cancelled: ${result.cancellationReason}`);
console.log('Partial results may be available');
} else if (result.success) {
console.log('Execution completed successfully');
}

Cancellation in Parallel Execution
Cancellation works seamlessly with parallel execution, allowing you to stop all running tasks:
const controller = new AbortController();
// User clicks cancel button
document.getElementById('cancelButton').onclick = () => {
controller.abort();
};
// Execute parallel prompt with multiple models
const result = await runner.ExecutePrompt({
prompt: parallelPrompt, // ParallelizationMode: 'ModelSpecific'
data: analysisData,
contextUser: currentUser,
cancellationToken: controller.signal
});
// Parallel cancellation behavior:
// - Tasks not yet started will be marked as cancelled
// - Currently executing tasks will be terminated
// - Completed tasks remain in the results
// - Partial results may still be available for analysis

Multiple Cancellation Sources
One of the powerful aspects of the AbortSignal pattern is that multiple sources can cancel the same operation:
async function intelligentPromptExecution() {
const controller = new AbortController();
const signal = controller.signal;
// 1. User cancel button
document.getElementById('cancelBtn')?.addEventListener('click', () => {
controller.abort(); // User-initiated cancellation
console.log('User cancelled the operation');
});
// 2. Timeout cancellation (prevent runaway prompts)
const timeout = setTimeout(() => {
controller.abort(); // Timeout cancellation
console.log('Operation timed out after 2 minutes');
}, 120000);
// 3. Resource limit cancellation
const memoryCheck = setInterval(async () => {
if (await getMemoryUsage() > MAX_MEMORY_THRESHOLD) {
controller.abort(); // Resource limit cancellation
console.log('Cancelled due to memory limits');
}
}, 5000);
// 4. Window unload cancellation (cleanup on page close)
window.addEventListener('beforeunload', () => {
controller.abort(); // Page closing cancellation
});
try {
const result = await runner.ExecutePrompt({
prompt: complexAnalysisPrompt,
data: largeDataset,
cancellationToken: signal // One token, many cancel sources!
});
// Clean up timers if successful
clearTimeout(timeout);
clearInterval(memoryCheck);
return result;
} catch (error) {
// The Prompt Runner doesn't know WHY it was cancelled
// It just knows it should stop gracefully
console.log('Prompt execution was cancelled:', error.message);
} finally {
clearTimeout(timeout);
clearInterval(memoryCheck);
}
}

Cancellation in Component-Based UIs
Perfect for React, Angular, or Vue components:
class PromptExecutionComponent {
private currentController: AbortController | null = null;
private isExecuting: boolean = false;
async executePrompt(prompt: AIPromptEntity, data: any) {
// Cancel any existing execution
this.cancelCurrentExecution();
// Create new controller for this execution
this.currentController = new AbortController();
this.isExecuting = true;
try {
const result = await this.runner.ExecutePrompt({
prompt,
data,
cancellationToken: this.currentController.signal,
onProgress: (progress) => {
this.updateUI(`${progress.step}: ${progress.percentage}%`);
},
onStreaming: (chunk) => {
this.appendStreamingContent(chunk.content);
}
});
this.handleSuccess(result);
} catch (error) {
if (error.message.includes('cancelled')) {
this.handleCancellation();
} else {
this.handleError(error);
}
} finally {
this.isExecuting = false;
this.currentController = null;
}
}
// Called when user clicks "Cancel" or navigates away
cancelCurrentExecution() {
if (this.currentController && this.isExecuting) {
this.currentController.abort();
console.log('Cancelled current prompt execution');
}
}
// Component cleanup
ngOnDestroy() { // Angular example
this.cancelCurrentExecution();
}
}

Integration with BaseLLM Cancellation
The cancellation token is automatically propagated through the entire execution chain:
// Cancellation Flow in MemberJunction AI Architecture:
//
// 1. User Code (AbortController.signal)
// ↓
// 2. AIPromptRunner.ExecutePrompt(cancellationToken)
// ↓
// 3. ParallelExecutionCoordinator.executeTasksInParallel(cancellationToken)
// ↓
// 4. Individual Task Execution with cancellation
// ↓
// 5. BaseLLM.ChatCompletion({ cancellationToken })
// ↓
// 6. Provider-specific cancellation (fetch signal, Promise.race)
// ↓
// 7. AI Model API cancellation (if supported)
// At each level, cancellation is handled appropriately:
const internalFlow = {
// Level 1: Prompt Runner checks before major operations
promptRunner: () => {
if (cancellationToken?.aborted) {
return { success: false, cancelled: true };
}
},
// Level 2: Parallel coordinator cancels remaining tasks
parallelCoordinator: () => {
tasks.forEach(task => {
if (cancellationToken?.aborted) {
task.cancelled = true;
}
});
},
// Level 3: BaseLLM uses Promise.race for instant cancellation
baseLLM: () => {
return Promise.race([
actualModelCall(params),
cancellationPromise(cancellationToken)
]);
},
// Level 4: Native provider cancellation (where supported)
provider: () => {
fetch(apiUrl, {
signal: cancellationToken // Native browser/Node.js cancellation
});
}
};

Cancellation Guarantees
The AI Prompt Runner provides these cancellation guarantees:
- 🚫 Instant Recognition: Cancellation requests are checked at multiple points throughout execution
- 🧹 Graceful Cleanup: Partial results are preserved and returned when possible
- 📊 Proper Logging: Cancelled operations are logged with appropriate status and metadata
- 💾 Resource Release: Network connections and memory are cleaned up promptly
- 🔄 State Consistency: The system remains in a consistent state after cancellation
Key Benefits:
- Responsive UI: Users get immediate feedback when cancelling operations
- Resource Efficiency: Prevents wasted compute and API costs
- System Stability: Avoids memory leaks and hanging operations
- Standard Pattern: Uses native JavaScript APIs - no custom cancellation logic needed
Cancellation Result Properties
When execution is cancelled, the result includes detailed cancellation information:
interface AIPromptRunResult {
success: boolean;
cancelled?: boolean; // True if execution was cancelled
cancellationReason?: CancellationReason; // Why it was cancelled
status?: ExecutionStatus; // Current execution status
// ... other properties
}
type CancellationReason = 'user_requested' | 'timeout' | 'error' | 'resource_limit';
type ExecutionStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

Progress Updates & Streaming
The AI Prompt Runner provides real-time progress updates and streaming support for long-running executions, enabling responsive user interfaces and monitoring dashboards.
Progress Callbacks
Track execution progress through different phases:
const runner = new AIPromptRunner();
const result = await runner.ExecutePrompt({
prompt: complexPrompt,
data: { document: longDocument },
contextUser: currentUser,
// Progress callback receives updates throughout execution
onProgress: (progress) => {
console.log(`${progress.step}: ${progress.percentage}% - ${progress.message}`);
// Update UI progress bar
updateProgressBar(progress.percentage);
updateStatusMessage(progress.message);
// Access additional metadata
if (progress.metadata) {
console.log('Execution metadata:', progress.metadata);
}
}
});

Execution Progress Phases
The progress callback receives updates for these execution phases:
type ProgressPhase =
| 'template_rendering' // Rendering prompt template with data
| 'model_selection' // Selecting appropriate AI model
| 'execution' // Executing AI model
| 'validation' // Validating and parsing results
| 'parallel_coordination' // Coordinating parallel executions
| 'result_selection'; // AI judge selecting best result
// Example progress updates:
// template_rendering: 20% - "Rendering prompt template with provided data"
// model_selection: 40% - "Selected GPT-4 model based on prompt configuration"
// execution: 60% - "Executing AI model..."
// validation: 80% - "Validating output against expected format"
// result_selection: 90% - "AI judge selecting best result from 3 candidates"

Streaming Response Support
Receive real-time content updates as AI models generate responses:
const result = await runner.ExecutePrompt({
prompt: streamingPrompt,
data: { query: 'Generate a detailed report...' },
contextUser: currentUser,
// Streaming callback receives content chunks as they arrive
onStreaming: (chunk) => {
if (chunk.isComplete) {
console.log('Streaming complete');
finalizeDocument();
} else {
// Append content chunk to UI
appendToDocument(chunk.content);
// Show which model is generating content (for parallel execution)
if (chunk.modelName) {
showActiveModel(chunk.modelName);
}
}
}
});

Progress Updates in Parallel Execution
Progress tracking works seamlessly with parallel execution:
const result = await runner.ExecutePrompt({
prompt: par