prompt-version-manager v0.1.4
Centralized prompt management system for Human Behavior AI agents
PVM - Prompt Version Management
Git-like version control for AI prompts with automatic execution tracking
Features • Quick Start • Simplified API • Documentation • Examples
Overview
PVM (Prompt Version Management) brings Git-like version control to AI prompt development, with automatic tracking of every execution. The simplified API makes it easy to integrate prompt management into any AI workflow.
Quick Start with Simplified API
Installation
pip install prompt-version-management
# or
npm install prompt-version-management
Basic Usage in 3 Lines
from pvm import prompts, model
# Load and render a prompt template
prompt = prompts("assistant.md", ["expert", "analyze this data"])
# Execute with automatic tracking
response = await model.complete("gpt-4", prompt, "analysis-task")
print(response.content)
That's it! Every execution is automatically tracked with timing, tokens, costs, and model details.
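For example, the tracked details are available on the returned response object; the fields used below are the ones documented in the ModelResponse section later in this README:
print(response.execution_id)              # unique ID for this execution
print(response.metadata["tokens"])        # token usage breakdown
print(response.metadata["latency_ms"])    # execution time in milliseconds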
Simplified API Reference
prompts() - Load and Render Templates
Load prompt templates with variable substitution:
# Python
prompt = prompts("template_name.md", ["var1", "var2", "var3"])
# TypeScript
const prompt = prompts("template_name.md", ["var1", "var2", "var3"])
How it works:
- Templates are stored in the .pvm/prompts/ directory
- Variables are replaced in order of appearance (see the sketch below)
- YAML frontmatter is supported for metadata
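A minimal sketch of the "order of appearance" rule (an illustration of the idea, not PVM's actual implementation; it assumes each distinct {{placeholder}}, in order of first appearance, is paired with the corresponding list entry):
import re

def render_positional(template: str, values: list[str]) -> str:
    # Collect distinct placeholder names in order of first appearance.
    names: list[str] = []
    for name in re.findall(r"\{\{(\w+)\}\}", template):
        if name not in names:
            names.append(name)
    # Pair them with the supplied values and substitute every occurrence.
    mapping = dict(zip(names, values))
    return re.sub(r"\{\{(\w+)\}\}", lambda m: mapping.get(m.group(1), m.group(0)), template)

print(render_positional("You are a {{role}} assistant specialized in {{domain}}.",
                        ["helpful AI", "data analysis"]))
# -> You are a helpful AI assistant specialized in data analysis.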
Example template (.pvm/prompts/assistant.md):
---
version: 1.0.0
description: General assistant prompt
---
You are a {{role}} assistant specialized in {{domain}}.
Task: {{task}}
Please be {{tone}} in your response.
Usage:
prompt = prompts("assistant.md", [
"helpful AI", # {{role}}
"data analysis", # {{domain}}
"analyze sales", # {{task}}
"concise" # {{tone}}
])
model.complete() - Execute with Auto-Tracking
Execute any LLM with automatic tracking:
# Python
response = await model.complete(
model_name="gpt-4", # Model to use
prompt="Analyze this text", # Prompt or previous response
tag="analysis", # Tag for tracking
json_output=MyModel, # Optional: structured output
temperature=0.7 # Optional: model parameters
)
# TypeScript
const response = await model.complete(
"gpt-4", // Model to use
"Analyze this text", // Prompt or previous response
"analysis", // Tag for tracking
{ schema: mySchema }, // Optional: structured output
{ temperature: 0.7 } // Optional: model parameters
)
Automatic features:
- Provider detection (OpenAI, Anthropic, Google)
- Execution tracking (tokens, latency, cost)
- SHA-256 prompt hashing
- Git-like auto-commits
- Chain detection
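As an illustration of the provider-detection idea (a sketch of the concept, not PVM's actual detection logic), the provider can be inferred from the model name:
def detect_provider(model_name: str) -> str:
    # Hypothetical mapping from well-known model-name prefixes to providers.
    prefixes = {"gpt": "openai", "claude": "anthropic", "gemini": "google"}
    for prefix, provider in prefixes.items():
        if model_name.startswith(prefix):
            return provider
    raise ValueError(f"Cannot detect provider for model: {model_name}")

print(detect_provider("gpt-4"))       # openai
print(detect_provider("claude-3"))    # anthropic
print(detect_provider("gemini-pro"))  # google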
Automatic Chaining
Pass a response directly to create execution chains:
# First agent
analysis = await model.complete("gpt-4", prompt1, "analyze")
# Second agent - automatically chains!
summary = await model.complete("claude-3", analysis, "summarize")
# Third agent - chain continues
translation = await model.complete("gemini-pro", summary, "translate")
PVM automatically:
- Links executions with a chain ID
- Tracks parent-child relationships
- Preserves context across models
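To see the link, inspect the tracking metadata on a chained response; chain_id and tag are among the metadata fields documented in the ModelResponse section below (whether the very first execution in a chain also records a chain ID is an assumption here, so only the chained response is inspected):
# Continuing the example above: `summary` was produced by passing `analysis` in.
print(summary.metadata["chain_id"])   # shared chain identifier
print(summary.metadata["tag"])        # "summarize"
print(analysis.execution_id)          # parent execution in the chain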
Native Structured Output
Get typed responses using each provider's native capabilities:
from pydantic import BaseModel
class Analysis(BaseModel):
    sentiment: str
    confidence: float
    keywords: list[str]
# Automatic structured output
result = await model.complete(
"gpt-4",
"Analyze: PVM is amazing!",
"sentiment",
json_output=Analysis # Pass Pydantic model
)
print(result.content.sentiment) # "positive"
print(result.content.confidence) # 0.95
print(result.content.keywords)   # ["PVM", "amazing"]
Provider-specific implementations:
- OpenAI: uses response_format with Pydantic models
- Anthropic: uses tool/function calling
- Google: uses response_schema with Type constants
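Whichever provider you call, the json_output usage stays the same. A sketch reusing the Analysis model defined above (it assumes the listed model names are available with your API keys):
# Same Pydantic model, different providers; PVM selects the native mechanism.
openai_result = await model.complete(
    "gpt-4", "Analyze: PVM is amazing!", "sentiment", json_output=Analysis
)
claude_result = await model.complete(
    "claude-3", "Analyze: PVM is amazing!", "sentiment", json_output=Analysis
)
print(openai_result.content.sentiment, claude_result.content.sentiment)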
ModelResponse Object
Every execution returns a ModelResponse with:
response.content # The actual response (string or object)
response.model # Model used
response.execution_id # Unique execution ID
response.prompt_hash # SHA-256 of prompt
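#   For example (an assumption about how the hash is computed; exact normalization may differ):
#   hashlib.sha256(prompt.encode("utf-8")).hexdigest() == response.prompt_hash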
response.metadata # Execution details:
- tag # Your tracking tag
- provider # Detected provider
- chain_id # Chain ID (if chained)
- tokens # Token usage breakdown
- latency_ms # Execution time
- structured # Whether structured output was used
# Methods
str(response) # Convert to string
response.to_prompt()   # Format for chaining
Full Version Control Features
Initialize Repository
pvm init
Add and Track Prompts
# Add a prompt file
pvm add prompts/assistant.md -m "Add assistant prompt"
# Track executions automatically
python my_agent.py # Uses simplified API
# View execution history
pvm log
# See execution dashboard
pvm dashboard
Branching and Experimentation
# Create experiment branch
pvm branch experiment/new-approach
pvm checkout experiment/new-approach
# Make changes and test
# ... edit prompts ...
# Merge back when ready
pvm checkout main
pvm merge experiment/new-approach
Analytics and Insights
# View execution analytics
pvm analytics summary
# See token usage
pvm analytics tokens --days 7
# Track costs
pvm analytics cost --group-by model
Repository Structure
.pvm/
├── prompts/              # Prompt templates
│   ├── assistant.md
│   ├── analyzer.md
│   └── summarizer.md
├── config.json           # Repository configuration
├── HEAD                  # Current branch reference
└── objects/              # Content-addressed storage
    ├── commits/          # Commit objects
    ├── prompts/          # Prompt versions
    └── executions/       # Execution records
Advanced Features
Templates with Complex Variables
# Example template with several sections, saved as .pvm/prompts/complex_template.md
template = """
You are a {{role}}.
## Instructions
{{instructions}}
## Context
{{context}}
## Output Format
{{format}}
"""
# Load and render it with prompts(), passing variables in order of appearance
prompt = prompts("complex_template.md", [
"senior data analyst",
"Analyze trends and patterns",
"Q4 sales data from 2023",
"Bullet points with insights"
])
Multi-Provider Execution
providers = ["gpt-4", "claude-3-opus", "gemini-ultra"]
# Run same prompt across providers
for model_name in providers:
    response = await model.complete(
        model_name,
        prompt,
        f"comparison-{model_name}"
    )
    print(f"{model_name}: {response.content}")
Custom Tracking Metadata
response = await model.complete(
"gpt-4",
prompt,
"analysis",
temperature=0.3,
max_tokens=1000,
# Custom parameters stored in metadata
user_id="12345",
session_id="abc-def",
experiment="v2"
)
Working with Branches
from pvm import Repository
repo = Repository(".")
# Create and switch branch
await repo.branch("feature/new-prompts")
await repo.checkout("feature/new-prompts")
# Make changes
prompt = prompts("new_assistant.md", ["expert", "task"])
response = await model.complete("gpt-4", prompt, "test")
# Commit and merge
await repo.commit("Test new prompts")
await repo.checkout("main")
await repo.merge("feature/new-prompts")
Environment Setup
Set your API keys:
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
Or use a .env file:
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
Execution Analytics
View detailed analytics:
# Summary statistics
pvm analytics summary
# Token usage over time
pvm analytics tokens --days 30 --group-by model
# Cost analysis
pvm analytics cost --group-by tag
# Chain visualization
pvm analytics chains --limit 10
Testing
Run tests for both implementations:
# Python tests
pytest tests/
# TypeScript tests
npm test
Examples
Two-Agent Analysis Pipeline
from pvm import prompts, model
from pydantic import BaseModel
# Define output structures
class SentimentResult(BaseModel):
    sentiment: str
    confidence: float

class Summary(BaseModel):
    title: str
    key_points: list[str]
# Input to analyze (example value so the snippet is self-contained)
text = "The latest release shipped on time and the support response was quick."
# Agent 1: Sentiment Analysis
sentiment_prompt = prompts("sentiment.md", ["expert analyst", text])
sentiment = await model.complete(
"gpt-4",
sentiment_prompt,
"sentiment-analysis",
json_output=SentimentResult
)
# Agent 2: Summarization (chained)
summary = await model.complete(
"claude-3",
sentiment, # Chains automatically!
"summarization",
json_output=Summary
)
print(f"Sentiment: {sentiment.content.sentiment}")
print(f"Summary: {summary.content.title}")A/B Testing Prompts
# Shared inputs for both prompt versions (example values)
variables = ["helpful AI", "data analysis", "analyze sales", "concise"]
# Version A
prompt_a = prompts("assistant_v1.md", variables)
response_a = await model.complete("gpt-4", prompt_a, "test-v1")
# Version B
prompt_b = prompts("assistant_v2.md", variables)
response_b = await model.complete("gpt-4", prompt_b, "test-v2")
# Compare results
print(f"Version A tokens: {response_a.metadata['tokens']['total']}")
print(f"Version B tokens: {response_b.metadata['tokens']['total']}")Contributing
We welcome contributions! Please see our Contributing Guide for details.
License
MIT License - see LICENSE for details.
