# Oroboreo - The Golden Loop
The self-improving, cost-optimized autonomous development engine that gets better with every iteration.
Oroboreo is a meta-development system that combines Claude Code, AWS Bedrock, intelligent model routing, and self-learning capabilities into a single, autonomous workflow.
## What is Oroboreo?
### 1. Autonomous & Self-Sufficient
- ✅ Plans its own work (Opus 4.6 PRD generation)
- ✅ Executes with optimal cost/power ratio (Sonnet/Haiku routing)
- ✅ Commits progress automatically (Git integration)
- ✅ Learns from past sessions (Archive analysis)
- ✅ Self-improves over time (Feedback loop)
### 2. Cost-Optimized & Flexible
- Supports AWS Bedrock OR Anthropic API (Claude Code subscription)
- Smart model routing: Simple tasks → Haiku ($1/$5), Complex → Sonnet ($3/$15)
- Real cost tracking with CloudWatch integration (Bedrock only)
- Typical feature: $1-3 vs $5-10 with competitors
- Easy provider switching via a single environment variable
### 3. Production-Ready
- Full tool execution via Claude Code
- Smart context management (no memory loss)
- Retry logic with exponential backoff (see the sketch after this list)
- Archive with year/month structure (archives/YYYY/MM/)
- Git branching and auto-commits
- Post-mortem diagnostics for hung tasks
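The retry pattern is standard exponential backoff: double the wait after each failure. A minimal sketch of the idea, illustrative rather than Oroboreo's actual code:
```js
// Minimal exponential-backoff sketch; illustrative, not Oroboreo's actual code.
async function withRetry(fn, { attempts = 4, baseMs = 1000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      if (i === attempts - 1) throw err; // out of retries, surface the error
      const delayMs = baseMs * 2 ** i;   // 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```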
### 4. Extensible & Template-Based
- Drop into any project in minutes
- Intelligent initialization (discovers your codebase)
- Language-agnostic (works with React, Node, Python, etc.)
- Customizable Universal Laws (your project's rules)
## Why Oroboreo?
### A Tool Built from Real Experience
I've spent 10 years building software across med-tech, logistics, research labs, and startups: from taking a no-code platform (before AI was a thing) from beta to $2M ARR as its first engineer, to architecting cloud infrastructure and deploying AI systems on embedded devices. I've seen what works at scale.
As AI rapidly evolved, I started building tools like this for my own projects: systems that don't just suggest code, but actually complete features autonomously and cost-effectively. Oroboreo is what emerged from that process.
I know there are many similar tools out there, but this is what helps me ship faster without breaking the bank or losing context across sessions. If it helps you too, that's why I'm sharing it.
This isn't about replacing developers. It's about giving devs & builders leverage to focus on architecture and decision-making while the tedious implementation loop runs itself.
## 🎯 The Five-Layer System
```
┌──────────────────────────────────────────┐
│ Layer 1: PLANNING (Opus 4.6)             │
│ ├─ Generate comprehensive PRDs           │
│ ├─ Tag task complexity [SIMPLE/COMPLEX]  │
│ └─ Cost: $0.30-2 per PRD (one-time)      │
└──────────────────────────────────────────┘
                     ↓
┌──────────────────────────────────────────┐
│ Layer 2: ROUTING (Smart Selection)       │
│ ├─ Parse PRD tasks                       │
│ ├─ Select model based on complexity      │
│ └─ Optimize cost vs. quality             │
└──────────────────────────────────────────┘
                     ↓
┌──────────────────────────────────────────┐
│ Layer 3: EXECUTION (Claude Code)         │
│ ├─ Full tool access (Read/Write/Edit)    │
│ ├─ Runs smartly (70% cost savings)       │
│ └─ Smart context (no amnesia)            │
└──────────────────────────────────────────┘
                     ↓
┌──────────────────────────────────────────┐
│ Layer 4: MEMORY (Persistent Learning)    │
│ ├─ Archives completed sessions           │
│ ├─ Extracts patterns and insights        │
│ └─ Updates "AGENTS.md" automatically     │
└──────────────────────────────────────────┘
                     ↓
┌──────────────────────────────────────────┐
│ Layer 5: FEEDBACK (Self-Improvement)     │
│ ├─ Analyzes historical performance       │
│ ├─ Recommends optimizations              │
│ └─ Improves future PRDs                  │
└──────────────────────────────────────────┘
```
## 💰 Cost Optimization
Oroboreo is designed for cost-effective autonomous development:
- No subscription fees - Pay only for AI compute you use
- Smart model routing - Uses Opus ($5/$25 per 1M tokens) for high-level planning, Sonnet ($3/$15) for complex implementation, Haiku ($1/$5) for simple tasks
- Multiple provider options - AWS Bedrock, Anthropic API, or (soon) Google Vertex AI
- Real-time cost tracking - Monitor spending per task in `costs.json`
- Full autonomy - Complete task loops with auto-retry, not just code suggestions
- Self-improving - Archives + feedback loops make it better with every session
Typical costs: $1-3 for a 12-task feature implementation
Example: User authentication feature with 12 tasks (4 simple + 8 complex) using smart routing typically costs $1-2 in API usage.
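Here is where that estimate comes from, using assumed token counts (roughly 30K input / 5K output per task; illustrative figures, not measurements):
```js
// Illustrative cost arithmetic for the 12-task example above.
// Per-task token counts are assumptions, not measured values.
const PRICES = { haiku: { in: 1, out: 5 }, sonnet: { in: 3, out: 15 } }; // $ per 1M tokens

const taskCost = (model, inTok, outTok) =>
  (inTok / 1e6) * PRICES[model].in + (outTok / 1e6) * PRICES[model].out;

const simple = 4 * taskCost('haiku', 30_000, 5_000);   // 4 x $0.055 = $0.22
const complex = 8 * taskCost('sonnet', 30_000, 5_000); // 8 x $0.165 = $1.32
console.log(`$${(simple + complex).toFixed(2)}`);      // ~$1.54, inside the $1-2 range
```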
## 📋 Prerequisites
- Node.js 18+ installed
- Claude Code CLI: `npm install -g @anthropic-ai/claude-code`
- GitHub CLI (optional): required for auto PR creation
  - Install: https://cli.github.com/
  - Without it, you'll see: `⚠️ GitHub CLI (gh) not installed. Skipping PR creation.`
## 🚀 Installation
### Option 1: NPM (Recommended)
```bash
# Install globally
npm install -g @oroboreo/cli

# Verify installation
oro-init --help
```
### Option 2: Clone Repository
```bash
# Clone and copy to your project
git clone https://github.com/chemnm/oroboreo.git
cp -r oroboreo/ your-project/oroboreo/
```
## 🍪 The Core Commands
### 🔧 Setup (One-Time)
```bash
# Navigate to your project
cd your-project

# Initialize Oroboreo (AI-powered or manual)
oro-init
# - Discovers your project structure
# - Creates creme-filling.md with Universal Laws
# - Sets up .env for AWS Bedrock (optional AI analysis: ~$0.10-0.30)
```
### 📝 Create Tasks
#### Option A: Generate NEW Feature Tasks
```bash
# Use Opus 4.6 to generate comprehensive task breakdown
oro-generate "Add user authentication with JWT"

# OR: Create new-prompt.md with detailed feature description, then:
oro-generate
# Script will ask if you want to use new-prompt.md
# File is archived and cleared after generation
```
- Opus creates detailed tasks in `cookie-crumbs.md`
- Tags tasks as [SIMPLE] or [COMPLEX]
- Supports inline, interactive, or file-based input
- Cost: ~$0.30-2.00 per PRD, depending on the complexity of your project
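For illustration, a generated `cookie-crumbs.md` might look like this; the exact format is an assumption based on the checkbox and tag conventions described in this README:
```markdown
# Feature: Add user authentication with JWT

- [ ] [SIMPLE] Add a users table migration with email and password-hash columns
- [ ] [COMPLEX] Implement POST /auth/register with input validation
- [ ] [COMPLEX] Implement POST /auth/login that issues a signed JWT
- [ ] [SIMPLE] Add a logout endpoint that clears the session cookie
```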
#### Option B: Generate FIX Tasks from Feedback
```bash
# 1. Write issues you found during testing in human-feedback.md
# 2. Run the feedback architect
oro-feedback
```
- Opus analyzes your feedback + latest archive
- Creates fix tasks in `cookie-crumbs.md`
- Cost: ~$0.15-0.40 per feedback session
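Feedback can be plain prose. An illustrative `human-feedback.md` (the tool's exact expectations may differ):
```markdown
## Issues found while testing the auth feature

- Login succeeds with an expired JWT; token expiry doesn't seem to be checked.
- The register form accepts passwords shorter than 8 characters.
- After logout, the dashboard still renders cached user data.
```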
#### Option C: Manual Tasks
- Edit `cookie-crumbs.md` directly with your own tasks
### ▶️ Execute Tasks
```bash
# Run the Golden Loop
oro-run
```
- Auto-loops through all tasks in `cookie-crumbs.md`
- Smart model selection (Haiku for [SIMPLE], Sonnet for [COMPLEX])
- Cost tracking in `costs.json`
- Git commits on task completion
- Cost: ~$1-3 per 12-task feature
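The exact `costs.json` schema isn't documented here, so treat this as a plausible sketch of a per-task entry rather than the real format:
```json
{
  "tasks": [
    {
      "task": "Implement POST /auth/login that issues a signed JWT",
      "model": "sonnet",
      "inputTokens": 31240,
      "outputTokens": 4830,
      "costUSD": 0.17
    }
  ],
  "totalUSD": 1.54
}
```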
### 📊 View Costs
```bash
# Export cost data to CSV or compare with CloudWatch
oro-costs
```
### 🔍 Diagnose Issues
```bash
# Post-mortem analysis for hung/failed tasks
oro-diagnose
```
- Analyzes execution logs from archived sessions
- Identifies timeout patterns and error causes
- Shows task duration, output silence periods, and failure reasons
- Helps debug overnight hangs or unexpected failures
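For intuition about "output silence periods": the core trick is a watchdog that tracks the child process's last output and kills it after prolonged silence. A minimal Node sketch, illustrative rather than Oroboreo's actual implementation:
```js
// Watchdog sketch: kill a task after 30 minutes without output.
// Illustrative only; 'some-long-task' is a placeholder command.
const { spawn } = require('node:child_process');

const SILENCE_LIMIT_MS = 30 * 60 * 1000;
const child = spawn('some-long-task', [], { stdio: ['ignore', 'pipe', 'pipe'] });

let lastOutput = Date.now();
child.stdout.on('data', () => { lastOutput = Date.now(); });
child.stderr.on('data', () => { lastOutput = Date.now(); });

const watchdog = setInterval(() => {
  if (Date.now() - lastOutput > SILENCE_LIMIT_MS) {
    console.error('No output for 30 minutes; killing task for post-mortem.');
    child.kill('SIGTERM');
    clearInterval(watchdog);
  }
}, 60 * 1000);

child.on('exit', () => clearInterval(watchdog));
```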
## 🎬 Quick Start
- NPM Install (Recommended): QUICKSTART.md
- Manual Clone: QUICKSTART-CLONE.md
## 🏗️ Architecture
### File Structure
```
your-project/
├── oroboreo/                  # All Oroboreo files live here
│   ├── cookie-crumbs.md       # Task list (THE PLAN)
│   ├── creme-filling.md       # System rules (THE LAW)
│   ├── progress.txt           # Session memory (THE LEARNINGS)
│   ├── human-feedback.md      # Your feedback input
│   ├── costs.json             # Cost tracking
│   ├── .env                   # AWS credentials (from .env.example)
│   ├── .env.example           # Template for credentials
│   ├── tests/                 # Verification scripts
│   │   ├── README.md          # Explains test organization
│   │   ├── reusable/          # Generic tests kept across sessions
│   │   │   ├── README.md      # What makes a test reusable
│   │   │   ├── verify-auth.js # Example: Generic auth check
│   │   │   └── check-api.js   # Example: API health check
│   │   └── verify-task-*.js   # Session-specific tests (archived after)
│   ├── utils/
│   │   ├── oreo-config.js     # Shared configuration (SINGLE SOURCE OF TRUTH)
│   │   ├── oreo-init.js       # Project initialization (SETUP)
│   │   ├── oreo-run.js        # Main execution loop (THE ENGINE)
│   │   ├── oreo-generate.js   # NEW feature task generator (PLANNER)
│   │   ├── oreo-feedback.js   # FIX task generator (ARCHITECT)
│   │   ├── oreo-archive.js    # Session archival with smart test sorting (HISTORIAN)
│   │   ├── oreo-costs.js      # Cost analysis & export (ACCOUNTANT)
│   │   ├── oreo-diagnose.js   # Post-mortem analysis for hung tasks (DEBUGGER)
│   │   ├── install.js         # Installation script
│   │   ├── run-with-prompt.bat   # Execute Claude Code
│   │   └── run-with-bedrock.bat  # Execute with Bedrock config
│   └── archives/              # Historical sessions (year/month organized)
│       └── 2026/              # Year folder
│           └── 01/            # Month folder
│               └── feature-name-2026-01-20-14-30/  # Session archive
│                   ├── cookie-crumbs.md
│                   ├── progress.txt
│                   ├── costs.json
│                   ├── oreo-execution.log  # Full execution log
│                   └── tests/              # Session-specific tests only
│                       └── verify-task-*.js
├── src/                       # Your project source
└── ...
```
### File Reference
| File | Purpose |
|------|---------|
| utils/oreo-config.js | Shared configuration - model IDs, costs, paths (SINGLE SOURCE OF TRUTH) |
| utils/oreo-init.js | Initialize Oroboreo in a new project (AI-powered or manual) |
| utils/oreo-run.js | Main loop - executes tasks from cookie-crumbs.md |
| utils/oreo-generate.js | Generate tasks for NEW features (uses Opus 4.6) |
| utils/oreo-feedback.js | Generate FIX tasks from human feedback (uses Opus 4.6) |
| utils/oreo-archive.js | Archive completed sessions with year/month structure (HISTORIAN) |
| utils/oreo-costs.js | Export costs to CSV or compare with CloudWatch (ACCOUNTANT) |
| utils/oreo-diagnose.js | Post-mortem analysis for hung/failed tasks (DEBUGGER) |
| cookie-crumbs.md | Task list with checkboxes (like PRD.md) |
| creme-filling.md | System rules injected into every agent (like AGENTS.md) |
| progress.txt | Shared memory between agent instances |
| human-feedback.md | Where you describe issues for the feedback architect |
| costs.json | Real-time cost tracking per task |
| tests/ | Session-specific verification scripts (archived after session) |
| tests/reusable/ | Generic verification scripts (persist across sessions) |
### Workflow
1. Initialize - Run `oro-init` to set up `creme-filling.md` (AI-powered or manual)
2. Plan Tasks - Choose your approach:
   - NEW Feature: `oro-generate "Add user authentication"` (Opus generates fresh tasks)
   - FIX Issues: Write issues in `human-feedback.md`, then run `oro-feedback` (Opus analyzes archives + creates fix tasks)
   - Manual: Write tasks directly in `cookie-crumbs.md`
3. Execute - `oro-run` loops through tasks (sketched below):
   - Parses the next incomplete task (`- [ ]`)
   - Selects model based on `[SIMPLE]`/`[COMPLEX]` tags (Haiku/Sonnet)
   - Spawns Claude Code with Bedrock
   - Tracks cost in `costs.json`
   - Logs execution to `oreo-execution.log`
   - Commits on completion
   - Marks task `- [x]`
   - 30-minute timeout with heartbeat logging
4. Archive - Completed sessions are preserved in `archives/YYYY/MM/session-name-timestamp/` for learning
5. Diagnose - If tasks hang or fail, run `oro-diagnose` on the archived session for analysis
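Conceptually, step 3's routing decision is just a tag check. Illustrative pseudologic, not the actual `oreo-run.js` source:
```js
// Conceptual sketch of the loop's model routing. Not the actual oreo-run.js source.
const fs = require('node:fs');

const crumbs = fs.readFileSync('oroboreo/cookie-crumbs.md', 'utf8');
const nextTask = crumbs.split('\n').find((line) => line.trim().startsWith('- [ ]'));

if (nextTask) {
  // [SIMPLE] tasks route to Haiku; everything else defaults to Sonnet.
  const model = nextTask.includes('[SIMPLE]') ? 'haiku' : 'sonnet';
  console.log(`Routing to ${model}:`, nextTask.trim());
  // ...spawn Claude Code with that model, track cost, commit, mark the task - [x]
}
```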
## 🔐 AI Provider Setup
Choose the option that fits your needs:
### Option 1: AWS Bedrock (Better rate limits for enterprise)
📚 Official Guide: See the Claude Code Bedrock Documentation for detailed setup instructions.
Prerequisites:
- AWS account with Bedrock access (us-east-1 region recommended)
- IAM user with `bedrock:InvokeModel` permission
- Claude models enabled (auto-enabled in most regions)
Configuration:
```bash
# Copy the example and fill in your credentials
cd oroboreo
cp .env.example .env

# Edit .env:
AI_PROVIDER=bedrock
AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
```
### Option 2: Microsoft Foundry (Use Your Azure Account)
📚 Official Guide: See Claude in Microsoft Foundry for detailed setup.
Prerequisites:
- Active Azure subscription
- Azure AI Foundry resource created
- Claude model deployment (Opus, Sonnet, or Haiku)
Configuration:
```bash
# Copy the example and fill in your credentials
cd oroboreo
cp .env.example .env

# Edit .env:
AI_PROVIDER=foundry
ANTHROPIC_FOUNDRY_API_KEY=your-azure-api-key
ANTHROPIC_FOUNDRY_RESOURCE=your-foundry-resource-name
```
Setup Steps:
1. Go to Azure AI Foundry
2. Create or select a Foundry resource
3. Navigate to Models + endpoints → Deploy model → Deploy base model
4. Search for and deploy a Claude model (e.g., `claude-sonnet-4-5`)
5. Copy the API key from the Keys and Endpoint section
### Option 3: Anthropic API (Pay-As-You-Go)
Prerequisites:
- Anthropic API account
- API key from https://console.anthropic.com/
Configuration:
```bash
# Copy the example and fill in your credentials
cd oroboreo
cp .env.example .env

# Edit .env:
AI_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
```
### Option 4: Claude Code Subscription (Simplest Setup)
Prerequisites:
- Claude Pro or Team subscription at https://claude.ai/
Configuration:
```bash
# One-time login
npx @anthropic-ai/claude-code login

# Edit .env:
AI_PROVIDER=subscription
# No API key needed!
```
### Model Configuration (Automatic)
Oroboreo automatically uses the correct model IDs based on your provider:
| Model | Cost (input/output per 1M tokens) | Used For |
|-------|---------------------|----------|
| Opus 4.6 | $5/$25 | Architect (PRD generation) |
| Sonnet 4.5 | $3/$15 | Complex tasks [COMPLEX] |
| Haiku 4.5 | $1/$5 | Simple tasks [SIMPLE] |
Note: Costs are identical across all providers (Bedrock, Foundry, Anthropic API).
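That lookup lives in `utils/oreo-config.js`. As a rough sketch of the idea (every ID below is a placeholder, not the package's real value):
```js
// Sketch of a provider -> model-ID lookup, in the spirit of oreo-config.js.
// All IDs are placeholders, not the package's real values.
const MODELS = {
  anthropic: { opus: '<anthropic-opus-id>', sonnet: '<anthropic-sonnet-id>', haiku: '<anthropic-haiku-id>' },
  bedrock:   { opus: '<bedrock-opus-id>', sonnet: '<bedrock-sonnet-id>', haiku: '<bedrock-haiku-id>' },
  foundry:   { opus: '<foundry-deployment>', sonnet: '<foundry-deployment>', haiku: '<foundry-deployment>' },
};

const provider = process.env.AI_PROVIDER || 'anthropic';
const modelFor = (tier) => MODELS[provider][tier]; // e.g., modelFor('haiku') for [SIMPLE] tasks
```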
## 🧠 The Creme Filling (System Rules)
Every project has core constraints that should never be violated. Define these in `creme-filling.md`:
Examples:
- Never expose database credentials in frontend code
- All API routes must use authentication middleware
- Components must follow atomic design principles
- Database queries must use parameterized statements
These rules are injected into every Claude Code instance, ensuring consistent behavior across all tasks.
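For illustration, a minimal `creme-filling.md` could look like this (the structure is an assumption; any rules the agent can read should work):
```markdown
# Universal Laws

1. Never expose database credentials in frontend code.
2. Every API route must pass through the authentication middleware.
3. Database queries must use parameterized statements; string-built SQL is forbidden.
4. Components follow atomic design: atoms → molecules → organisms.
```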
## 🌐 Use Cases
### 1. Solo Developers
- Build features 70% cheaper
- Focus on architecture, let Oroboreo handle implementation
- Learn from past sessions (what worked, what didn't)
### 2. Startups
- Rapid prototyping with minimal AI costs
- Consistent code quality (Universal Laws)
- Built-in documentation (PRDs + archives)
### 3. Agencies
- Template per client (drop in, customize, run)
- Cost tracking for billing
- Historical performance data
### 4. Open Source Projects
- Community contributors can generate PRDs
- Maintainers approve, Oroboreo executes
- Transparent cost tracking
## 🔮 Roadmap & Planned Features
### 📦 Distribution & Packaging
#### NPM Package Distribution
- [x] Publish to NPM as `@oroboreo/cli`
- [x] Global installation: `npm install -g @oroboreo/cli`
- [ ] NPX support: `npx @oroboreo init`, `npx @oroboreo run`
- [ ] Auto-update notifications for new versions
- [x] Semantic versioning and changelog automation
Why This Matters: Eliminate manual folder copying, make installation one command, standardize updates.
### 🧪 Autonomous Testing & Verification
#### Playwright Browser Automation
- [x] Built-in Playwright support for autonomous UI testing
- [x] Browser test utilities in `tests/reusable/browser-utils.js`
- [x] Auto-detect Playwright need and suggest installation
- [x] Console log capture and error detection
- [x] Screenshot evidence collection
- [ ] Visual regression testing (compare before/after screenshots)
- [ ] Accessibility testing integration (automated a11y checks)
- [ ] Video recording for debugging
Why This Matters: Eliminates manual UI testing by enabling Claude to verify its own changes. The agent can open browsers, click elements, test workflows, capture console errors, and collect evidence—all without human intervention. This closes the feedback loop and reduces dependency on human verification.
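As a concrete (hypothetical) example of a reusable browser check in this spirit, assuming Playwright is installed; the URL and paths are placeholders:
```js
// Sketch of an autonomous UI check: load a page, capture console errors,
// and save screenshot evidence. URL and output path are placeholders.
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  const consoleErrors = [];
  page.on('console', (msg) => {
    if (msg.type() === 'error') consoleErrors.push(msg.text());
  });

  await page.goto('http://localhost:3000/login');
  await page.screenshot({ path: 'oroboreo/tests/evidence/login.png' });
  await browser.close();

  if (consoleErrors.length > 0) {
    console.error('Console errors detected:', consoleErrors);
    process.exit(1); // non-zero exit signals failure back to the loop
  }
  console.log('✅ Page loaded with no console errors');
})();
```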
### 🔌 Integrations & Extensions
#### VS Code Extension
- [ ] Right-click → "Run Oroboreo Task"
- [ ] Task list management from sidebar
- [ ] Real-time execution progress panel
- [ ] Cost tracker widget in status bar
- [ ] Archive browser for historical sessions
#### GitHub Actions Integration
- [ ] Workflow: Comment `/oroboreo <feature>` on issues/PRs
- [ ] Automated PR creation from completed sessions
- [ ] Branch protection integration (require approval before merge)
- [ ] Cost budgeting controls for CI/CD
#### MCP (Model Context Protocol) Server Integrations
- [ ] Built-in MCP server management (install, configure, enable/disable)
- [ ] Popular MCP server templates (filesystem, git, database, API tools)
- [ ] Auto-discovery of project-relevant MCP servers (e.g., detect Postgres → suggest database MCP)
- [ ] Session-scoped MCP server activation (enable specific tools per task)
- [ ] Cost tracking for MCP tool usage
Why MCP Matters: Extends Claude's capabilities beyond code editing to databases, APIs, cloud resources, and custom tooling—all through Anthropic's standardized protocol.
### 🎨 UX & Developer Experience
#### Real-Time Improvements
- [ ] Streaming output during agent execution (no more waiting for task completion)
- [ ] Interactive prompts (agent asks clarifying questions mid-task)
- [ ] Task pause/resume functionality
- [ ] Parallel task execution (run multiple tasks concurrently)
#### Web UI Dashboard
- [ ] Cost analytics with charts (daily/weekly/monthly spend)
- [ ] Session explorer with search and filtering
- [ ] Task template library (share common workflows)
- [ ] Visual task breakdown editor (drag-and-drop PRD builder)
#### Testing & Quality Assurance
- [ ] Auto-generate verification tests from task descriptions
- [ ] Code review mode (post-session quality analysis)
- [ ] Security scanning integration (detect common vulnerabilities)
- [ ] Performance profiling for generated code
### 🌐 Claude Provider Expansion
#### Google Cloud Vertex AI Support
- [ ] Vertex AI provider option (Claude via Google Cloud)
- [ ] Unified credential management across AWS/GCP/Anthropic
- [ ] Cost comparison dashboard across providers
- [ ] Provider failover/redundancy (fallback to alternative if primary unavailable)
Why This Matters: Google Cloud users can access Claude through Vertex AI (via the Anthropic partnership), providing an alternative to AWS Bedrock or the direct Anthropic API.
### 🏢 Enterprise & Team Features
#### Team Collaboration Mode
- [ ] Shared task queues (assign tasks to team members)
- [ ] Session replay (review how agent completed tasks)
- [ ] Approval workflows (review PRDs before execution)
- [ ] Cost allocation per team/project
#### Audit & Compliance
- [ ] Comprehensive audit logs (who ran what, when, and cost)
- [ ] SSO integration (Google, Okta, Azure AD)
- [ ] Role-based access control (dev, reviewer, admin)
- [ ] Compliance reports (SOC2, GDPR, HIPAA-friendly logging)
### 🚀 Multi-Language & Framework Support
#### Project Templates
- [ ] React/Next.js starter template
- [ ] Python/FastAPI template
- [ ] Go microservices template
- [ ] Rust CLI tool template
- [ ] Community-contributed templates marketplace
## 🔭 Future Vision: Provider-Agnostic Architecture
⚠️ Architectural Rewrite Required
The current version is built on Claude Code, which is specifically designed for Anthropic's Claude models (Opus, Sonnet, Haiku). Supporting other model providers (OpenAI GPT, Google Gemini, local LLMs) would require:
- Replace Claude Code with a custom agent system
- Abstract tool execution (file operations, bash, git) to work across models
- Normalize model APIs (different providers use different formats)
- Reimplement cost tracking (each provider has unique pricing/token counting)
- Handle capability differences (not all models support extended thinking, tool use, etc.)
Planned (Post-1.0):
- [ ] Build unified agent abstraction layer
- [ ] Prototype with OpenAI + local LLM + Claude support
- [ ] Evaluate tradeoffs (complexity vs. flexibility)
- [ ] Community feedback on demand for multi-model support
Why Not Now? Maintaining quality for Claude models takes priority. Multi-model support would delay core features. I will revisit based on user demand.
## 🤝 Contributing
Obsessed with building the future of autonomous development? Contributions welcome:
1. Fork the repo
2. Create a feature branch (`git checkout -b feature/amazing`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing`)
5. Open a Pull Request
## 📜 License
MIT License - Use freely, commercially or personally.
## 🙏 Acknowledgments
- Anthropic for Claude Code integration
- AWS for Bedrock and easy enterprise-level model access
- The Ouroboros (the eternal snake) for inspiration 🐍
- The Ralph Wiggum AI Loop technique for Claude Code 🔁
- Oreo Yum 🍪
- Oro GOLD 🪙
## 📬 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Collaboration and Sponsorships: Email
