# Committee

Design by Committee™ except it's just you and LLMs
## Overview
Committee lets you assemble surgical, precisely scoped context and build templated, iterative prompts that chain together into multi-step LLM workflows.
## Core Example: Service Analysis
Let's illustrate the core workflow with an example designed to analyze different microservices based on their specific documentation and source code.
### 1. `workflow.yaml`

Defines file collections, global context, and the structured `services` object intended for iteration.
name: "service-analysis-workflow"
description: "Analyze multiple services using their specific docs and code"
outputPath: "_output/service-analysis"
# Define file collections
files:
# General Docs
architectureDoc: "docs/ARCHITECTURE.md"
# Auth Service Files
authConfigDoc: "docs/AUTH-CONFIG.md"
authCode: ["src/auth/**/*.js", "!src/auth/legacy/**"]
# Data Service Files
dataModelsDoc: "docs/DATA-MODELS.md"
dataCode: "src/data/**/*.js"
# Define universally accessible global variables
global_variables:
# General context available to all tasks
overallArchitecture: "{{ files.architectureDoc }}"
# Define data structures for set iteration
iterable_objects:
# Structured object containing service-specific context
services: # Target for 'for_each: services' in a set
auth: # Key becomes 'item.key' during iteration
# Value becomes 'item.value'
description: "Authentication and Authorization Service"
contact: "[email protected]"
# Embed CONTENT of auth-specific files
configDocContent: "{{ files.authConfigDoc }}"
codeContent: "{{ files.authCode }}"
data: # Key becomes 'item.key'
# Value becomes 'item.value'
description: "Data Processing and Storage Service"
contact: "[email protected]"
# Embed CONTENT of data-specific files
modelsDocContent: "{{ files.dataModelsDoc }}"
codeContent: "{{ files.dataCode }}"
# Define the sequence of sets
sets:
- useSet: analyze-service # Iterate over 'services' defined in iterable_objects
for_each: services(Note: The {{ files.collectionName }} syntax within global_variables or iterable_objects embeds the formatted content of the files.)
### 2. `sets/analyze-service.set.yaml`

Defines a set that iterates over the `services` object defined in the workflow's `iterable_objects`.
name: "analyze-service"
description: "Run analysis tasks for each service defined in the context"
# Iterates over the 'services' object from workflow.yaml's iterable_objects
# Each item will be { key: serviceName, value: serviceObject }
for_each: services
tasks:
# These tasks run in parallel for each service
- useTask: identify-service-patterns
# Task context automatically includes 'item', 'item.key', 'item.value'
# and variables from 'global_variables' like 'overallArchitecture'
- useTask: suggest-service-improvements3. tasks/analyze-service.md:
A task template showing how to access the context provided by the iteration and global variables.
Analyze the service: **{{ item.key }}**
**Service Description:** {{ item.value.description }}
**Contact:** {{ item.value.contact }}
**Overall Architecture Context:**
```
{{ overallArchitecture }} # Accessing a global_variable
```
**Service-Specific Configuration Documentation:**
```
{{ item.value.configDocContent }} # Accessing data from item.value
```
**Service-Specific Code:**
```
{{ item.value.codeContent }} # Accessing data from item.value
```
**Analysis Request:**
Based on the overall architecture and the specific documentation and code for the `{{ item.key }}` service, please perform the analysis requested by the calling task (e.g., identify patterns, suggest improvements).
This example demonstrates how to:

- Define multiple file sources.
- Define `global_variables` accessible everywhere.
- Structure data for iteration under `iterable_objects`.
- Iterate over this structured data using `for_each`.
- Access the iteration key (`item.key`), iteration value (`item.value.*`), and global variables within a task template.
## Key Concepts
Now let's dive deeper into the core components illustrated above.
### Workflows
A workflow is the top-level container defined in `workflow.yaml`, as seen in the Core Example. It specifies:

- `global_variables` accessible throughout the workflow. These form the base context.
- `iterable_objects` defining data structures (arrays/objects) intended for set iteration via `for_each`.
- Named file collections (`files:`) to gather context using glob patterns. File content is typically embedded into `global_variables` or `iterable_objects`.
- An ordered sequence of `sets` to be executed.
### Sets
Sets group related tasks. Sets defined in the workflow's `sets:` list are executed sequentially, in the order they appear.
Within a single set, the listed tasks are executed in parallel. Sets can optionally iterate over arrays or objects defined in the workflow's `iterable_objects` using `for_each`.
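As a minimal sketch of this sequencing (the set and task names below are hypothetical, not part of the framework):

```yaml
# workflow.yaml (excerpt): sets run sequentially, top to bottom
sets:
  - useSet: gather-facts    # runs first
  - useSet: write-summary   # starts only after gather-facts finishes

# sets/gather-facts.set.yaml: tasks within one set run in parallel
name: "gather-facts"
tasks:
  - useTask: extract-entities
  - useTask: extract-dates
```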
### Tasks
Tasks are templated prompts (stored as `.md` files) that perform a specific action using an LLM, like the `analyze-service.md` template in the Core Example. Each task runs with a context including:

- `global_variables` (from `workflow.yaml`).
- Iteration variables (`item`, `item.key`, `item.value` if the set uses `for_each`).
- Outputs from tasks in previous sets, accessed via `prior_outputs` defined in the set file.
**Important:** Due to parallel execution within a set, a task cannot access the output of another task running in the same set. Input/output dependencies must be managed by sequencing tasks across different sets.
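For instance, a task template can draw on all three context sources at once. In this sketch, `overallArchitecture` and `item.key` come from the Core Example above, while `analysis_result` is a hypothetical name that would be mapped under `prior_outputs` in the set file:

```markdown
Summarize the **{{ item.key }}** service.

Architecture context (a global variable):
{{ overallArchitecture }}

Findings from a previous set (a prior output):
{{ analysis_result }}
```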
**Escaping Template Syntax:** If you need to include literal `{{` or `}}` characters in your template without them being interpreted as variables, you can escape them with a backslash: `\{{` will render as `{{`, and `\}}` will render as `}}`.
### Referencing Output from Iterated Sets
When dealing with outputs from previous iterated sets, there are two main scenarios:
1. **Accessing Corresponding Iteration Output:** When both the previous set (e.g., `set1`) and the current set (e.g., `set2`) iterate over the same `for_each` target, you often need to access the output from the previous set's task corresponding to the current item being processed.

   - Syntax: `setName.taskName[this].output`
   - Use Case: An iterated set needs the specific output from the same iteration of a previous iterated set.
   - Result: Resolves to the single output value for the current iteration.

   Example (`set2` iterated, needs corresponding output from iterated `set1`):

   ```yaml
   # In sets/set2.set.yaml (for_each: services)
   prior_outputs:
     # Get the analyze-service output for the current service
     analysis_result: "{{ set1.analyze-service[this].output }}"
   ```

2. **Collecting All Iteration Outputs:** When a subsequent set (often a non-iterated set, e.g., `setB`) needs to gather all the individual outputs generated by a task within a previous iterated set (e.g., `setA`).

   - Syntax: `setName.taskName[*].output`
   - Use Case: A later set needs to aggregate or process the results from all iterations of a previous iterated task.
   - Result: Resolves to an array containing all the output values generated across all iterations of the specified task.

   Example (`setB` non-iterated, needs all outputs from iterated `setA`):

   ```yaml
   # In sets/setB.set.yaml (NOT iterated)
   prior_outputs:
     # Gather all results from setA's analyze-item task into an array
     all_analysis_results: "{{ setA.analyze-item[*].output }}"
   ```

   (Note: The task template using `{{ all_analysis_results }}` will receive these outputs as a newline-separated string by default. Handle accordingly in your prompt.)
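A downstream task template could then consume the aggregated, newline-separated results directly. A sketch, assuming `all_analysis_results` is mapped as above:

```markdown
Below is one analysis result per item, collected from every iteration:

{{ all_analysis_results }}

Synthesize these into a single summary, calling out patterns that
appear across multiple items.
```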
### Referencing Output from Non-Iterated Sets
If the previous set was not iterated, you simply reference its task output directly:
- `setName.taskName.output`: Output from a non-iterated task in a previous set. The `taskName` used here must match the `useTask` value from the task definition in the previous set's YAML file.
Example (`set2` non-iterated, needs output from non-iterated `set1`):

```yaml
# In sets/set2.set.yaml (NOT iterated)
prior_outputs:
  taskA_result: "{{ set1.taskA.output }}"
```

Why other syntaxes fail:

- `"{{ set1.analyze-service.output }}"`: Refers to the entire array of outputs from the iterated task, not the specific one needed.
- `"{{ set1.analyze-service[item.key].output }}"`: The `prior_outputs` resolver doesn't evaluate `{{item.key}}` within the reference string; it looks for a literal key `item.key`.
**Important Convention:** Task outputs are always stored and referenced using the exact name specified in the `useTask` field. There is no option to rename outputs.
Example Set Configuration (`*.set.yml`):

If `set1` (non-iterated) contains a task `useTask: taskA`, and `set2` (non-iterated) needs its output:
```yaml
name: set2
tasks:
  - useTask: process-output
    prior_outputs:
      # Map the reference to a local variable name for use in the task template
      taskA_result: "{{ set1.taskA.output }}"  # Reference uses the original task name 'taskA'
```

Example Task Template (`tasks/process-output.md`):

```markdown
Processing output for file {{ item.path }}.

Result from Task A in Set 1:
{{ taskA_result }}  # Access the output via the name defined in prior_outputs
```

Note: Referencing outputs from tasks within the same parallel set execution is unreliable and should be avoided. Structure your workflow with sequential sets for dependencies.
## Where Data Comes From: Defining Your Context
Understanding where different types of data are defined and accessed is important for using Committee. The framework uses the following structure:
- **Global Variables:** Defined in the top-level `global_variables:` block of your `workflow.yaml`. These are accessible to all sets and tasks throughout the workflow execution.
- **File Collections & Content:** File sources are defined in the `files:` block of `workflow.yaml`. To make file content available for LLM analysis, embed it into variables within the `workflow.yaml` `global_variables:` or `iterable_objects:` blocks using `{{ files.collectionName }}`. Task templates (`.md`) can reference `{{ files.collectionName }}` to get a list of paths.
- **Iteration Data (`item`):** Data structures (arrays or objects) intended for iteration using `for_each` are defined in the `iterable_objects:` block of `workflow.yaml`. The `for_each: objectName` directive within a `*.set.yml` file targets one of these workflow iterable objects. Tasks within that set then access the current iteration's data via the `item` object (or `item.key`/`item.value` for object iteration).
- **Task Outputs (via Prior Outputs):** Outputs from previous tasks are made available to a subsequent task via the `prior_outputs:` block defined under that task in its `*.set.yml` file. This block maps a local name (used in the task template) to the structured output reference string (e.g., `setName.taskName[iterationKey].output`).
Essentially, `workflow.yaml` is the primary location for defining the initial context (`global_variables`), data sources (`files`), and data for iteration (`iterable_objects`), while `*.set.yml` files orchestrate the execution flow and manage dependencies on previously generated task outputs via `prior_outputs`.
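Put together, the division of responsibilities looks roughly like this (the file, set, and task names here are illustrative, not part of the framework):

```yaml
# workflow.yaml -- defines context and data
files:
  specs: "docs/**/*.md"
global_variables:
  specContext: "{{ files.specs }}"   # embedded file content
iterable_objects:
  modules:                           # target for 'for_each: modules'
    api: { description: "API module" }
    web: { description: "Web module" }

# sets/report.set.yaml -- orchestrates execution and dependencies
name: "report"
tasks:
  - useTask: write-report
    prior_outputs:
      # collect every iteration's output from a hypothetical earlier set
      module_findings: "{{ review.inspect-module[*].output }}"
```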
## File Collection Handling
You define named file collections in workflow.yaml using file paths or glob patterns (include/exclude):
```yaml
# workflow.yaml
name: "code-review-workflow"
files:
  sourceCode:
    include: ["src/**/*.js"]
    exclude: ["src/vendor/**"]
  testFiles: "test/**/*.test.js"
  docs: ["README.md", "CONTRIBUTING.md"]
# ... global_variables, iterable_objects, and sets follow ...
```

These collections are primarily used to inject context into your workflow. The way you reference a collection using `{{ files.collectionName }}` has two behaviors depending on where it is used:
1. **In `workflow.yaml` (`global_variables:` or `iterable_objects:`):**

   - Behavior: Embeds the full content of each file within the collection directly into the variable's string value. Each file's content is automatically prefixed with a Markdown header indicating its path (e.g., `# path/to/file.js`).
   - Purpose: This is the primary mechanism for injecting substantial file content (like source code, documentation) into the context, making it available to subsequent sets and tasks for direct LLM analysis.
   - Example (`workflow.yaml`):

     ```yaml
     global_variables:
       # Embeds the content of all files matching src/**/*.js,
       # each block prefixed with '# filepath'
       sourceContext: "{{ files.sourceCode }}"
       # Embeds content of README.md and CONTRIBUTING.md
       docsContext: "{{ files.docs }}"
     ```

2. **In Task Templates (`*.md` files):**

   - Behavior: Renders a newline-separated list of the file paths belonging to that collection. It does not embed the file content here.
   - Purpose: Useful for providing informational context within a task prompt, such as listing related files for the LLM's reference, without including their potentially large content directly in that specific prompt.
   - Example (`tasks/review-code.md`):

     ````markdown
     Review the following source code file `{{ item.path }}`:

     ```javascript
     {{ item.content }}  # Assuming iteration over a file collection
     ```

     Consider related test files (paths listed below):
     {{ files.testFiles }}  # Lists paths from the 'testFiles' collection
     ````
**Key Distinction:** Use `{{ files.collectionName }}` in `workflow.yaml` (`global_variables` or `iterable_objects`) to provide the *content* needed for LLM analysis. Use it in task templates (`.md`) when you only need to reference the *paths* of the files.
(Note: Advanced pattern filtering within the template tag, like `{{ files.collectionName:*.js }}`, is not currently implemented.)
## Two-Phase Thinking
Tasks can optionally perform a preliminary "thinking" step before generating the final response. This is useful for complex analysis or reasoning tasks. Configure this using YAML frontmatter at the top of your task's .md file:
```markdown
---
name: "complex-analysis-task"                  # Optional: Task name for clarity
thinking: true                                 # REQUIRED: Enables the thinking phase
thinking_prompt: "path/to/thinking-prompt.md"  # Optional: Use a separate prompt file for the thinking phase
thinking_instruction: "Analyze the input step-by-step..."  # Optional: Specific instruction for the thinking phase
thinking_params:
  temperature: 0.2                             # Optional: LLM parameters specifically for the thinking phase
---

# Main Task Prompt

Based on the preceding analysis, provide the final answer.

Context:
{{ context }}
```

- If `thinking: true`, the framework first runs the thinking phase (using the main prompt, or `thinking_prompt` if provided, potentially guided by `thinking_instruction`).
- The output of the thinking phase is then automatically prepended to the context provided to the main task prompt for generating the final response.
- You can control LLM parameters specifically for the thinking step using `thinking_params`.
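As a sketch of what a separate thinking prompt file might contain (the path, the wording, and the assumption that it can reference the same `{{ context }}` variable as the main prompt are all illustrative, not confirmed by the framework):

```markdown
<!-- path/to/thinking-prompt.md (hypothetical) -->
Before drafting the final answer, reason step-by-step:

1. List the key facts present in the context.
2. Note any gaps or ambiguities.
3. Outline the structure the final answer should take.

Context:
{{ context }}
```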
## Using the Framework

### Installation
```bash
# Navigate to the project root directory

# Install globally (recommended for CLI use)
npm install -g .

# Or install locally
npm install .
```

### Basic Usage
1. Create a workflow directory (e.g., `my-workflow/`) containing:

   - `workflow.yaml` (workflow definition)
   - `sets/` directory (with `.set.yaml` or `.set.yml` set definitions)
   - `tasks/` directory (with `.md` task prompt files)

2. Configure your environment variables (e.g., in a `.env` file in your project or system):

   ```bash
   # Required for using Anthropic API (if not using --local)
   ANTHROPIC_API_KEY=your_api_key_here

   # Optional: Specify default model (defaults exist, e.g., Claude 3 Haiku for --lite, Sonnet otherwise)
   # DEFAULT_MODEL=claude-3-sonnet-20240229

   # Optional: Set maximum tokens for LLM responses (default: 10000)
   # MAX_TOKENS=100000

   # Optional: Set maximum number of concurrent API requests (default: 10)
   # MAX_PARALLEL_REQUESTS=15

   # Optional: Set minimum delay between starting parallel API requests (in seconds, default: 0.1)
   # Useful for proactively avoiding rate limits based on request frequency.
   # REQUEST_DELAY_SECONDS=0.5

   # Optional: Set maximum number of retries for failed API calls (default: 20)
   # LLM_MAX_RETRIES=10

   # Optional: For using a local LLM (requires --local flag)
   # Needs a running server compatible with OpenAI API spec (e.g., Ollama, LM Studio)
   # Default Ollama URL example:
   LOCAL_LLM_URL=http://localhost:11434

   # Optional: Specify model served by local URL (required if server hosts multiple)
   # LOCAL_LLM_MODEL=llama3
   ```

3. Run the workflow from your terminal:
