ultraagent
v1.1.0
Autonomous AI coding assistant for the terminal — powered by local LLMs on Apple Silicon. 18 built-in tools, auto-setup, project scanner, sub-agents.
Getting Started · Commands · Tool System · Project Creation · Configuration
What is UltraAgent?
UltraAgent is a terminal-based AI coding assistant that operates autonomously using an integrated tool system. It reads, writes, and edits files, runs shell commands, searches codebases, manages git, spawns sub-agents, persists memory, and browses the web — all controlled by a configurable three-tier permission system.
Runs 100% local on your machine — no API keys, no cloud, no data leaves your computer. Supports Ollama, LM Studio, MLX-LM, and llama.cpp.
Highlights
- 100% local & private — no API keys, no cloud, all data stays on your machine
- Auto-Setup — detects your hardware and backends, then recommends the optimal model + settings
- 18 built-in tools — file I/O, shell, git, sub-agents, memory, web, planning
- Project creation wizard — scaffold production-ready projects from scratch
- Project scanner — deep analysis auto-injected into every LLM request
- Three-tier permissions — safe / confirm / dangerous, user approves before risky ops
- Agentic loop — LLM autonomously selects tools, executes, observes, iterates (up to 30 rounds)
- Sub-agents — spawn parallel agents for concurrent tasks
- Persistent memory — context that survives across sessions
- Apple Silicon optimized — curated model recommendations for M-series chips
- Smart caching — LM Studio KV-Cache, Flash Attention, quantized KV for optimal performance
- Real-time streaming — token-by-token output with repetition detection
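The agentic loop in the highlights above can be pictured as a decide/execute/observe cycle. Here is an illustrative TypeScript sketch (hypothetical names, not the actual UltraAgent source): the model picks a tool, the runtime executes it, the observation is fed back, and the cycle repeats until the model answers or the 30-round limit is hit.

```typescript
// Illustrative sketch of the agentic loop (hypothetical names, not the
// actual source): decide -> execute -> observe, up to MAX_ROUNDS times.

type ToolCall = { tool: string; args: Record<string, unknown> };
type Step = { call?: ToolCall; answer?: string };

const MAX_ROUNDS = 30;

function runAgentLoop(
  decide: (observations: string[]) => Step,  // stands in for the LLM
  execute: (call: ToolCall) => string,       // stands in for the tool registry
): string {
  const observations: string[] = [];
  for (let round = 0; round < MAX_ROUNDS; round++) {
    const step = decide(observations);
    if (step.answer !== undefined) return step.answer;    // model is done
    if (step.call) observations.push(execute(step.call)); // observe, iterate
  }
  return "round limit reached";
}

// Toy run: read one file, then answer.
const result = runAgentLoop(
  (obs) =>
    obs.length === 0
      ? { call: { tool: "read_file", args: { path: "README.md" } } }
      : { answer: `done after ${obs.length} tool call(s)` },
  (call) => `<contents of ${call.args.path}>`,
);
console.log(result); // "done after 1 tool call(s)"
```

The real loop streams tokens and enforces permissions per tool call; this sketch only shows the control flow.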
Getting Started
Quick Start (Auto-Setup)
```
git clone https://github.com/zurd46/UltraAgent.git
cd UltraAgent
npm install
npm run dev -- local setup   # auto-detects hardware, backend, sets optimal config
npm run dev -- chat          # start coding
```

The `local setup` command automatically:
- Detects your chip, RAM, GPU cores
- Finds running backends (LM Studio, Ollama, MLX-LM, llama.cpp)
- Recommends the best model for your hardware
- Sets optimal context length, KV-cache, batch size, flash attention
- Verifies the connection
Manual Setup
```
npm run dev -- config set provider local-openai
npm run dev -- config set localModel qwen2.5-coder:14b
npm run dev -- config set localBaseUrl http://localhost:1234/v1
npm run dev -- chat
```

Global Installation

```
npm run build && npm install -g .
ultraagent local setup
ultraagent chat
```

Alias: `ua` works everywhere instead of `ultraagent`.
Supported Backends
| Backend | Provider | Default Port | Description |
|:--------|:---------|:-------------|:------------|
| LM Studio | local-openai | localhost:1234 | Desktop app with Metal GPU, automatic prompt caching |
| Ollama | ollama | localhost:11434 | Simplest setup, native Metal GPU support |
| MLX-LM | local-openai | localhost:8080 | Apple's ML Framework, fastest on M-chips |
| llama.cpp | local-openai | localhost:8080 | Maximum control, OpenAI-compatible API |
Recommended Models (Apple Silicon)
| RAM | Model | Size | Quality | Speed |
|:----|:------|:-----|:--------|:------|
| 64GB+ | Llama 3.3 70B (Q4) | 40 GB | Excellent | Slow |
| 32GB | Qwen 2.5 Coder 32B (Q4) | 18 GB | Excellent | Medium |
| 32GB | Codestral 22B | 13 GB | Excellent | Medium |
| 16GB | Qwen 2.5 Coder 14B | 9 GB | Excellent | Fast |
| 16GB | DeepSeek Coder V2 16B | 9 GB | Excellent | Fast |
| 8GB | Llama 3.1 8B | 4.7 GB | Good | Fast |
| 4GB | Qwen 2.5 Coder 3B | 1.9 GB | Basic | Fast |
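The RAM-to-model mapping above can be sketched as a simple descending lookup. This is an illustration only, not the actual `model-recommender` implementation:

```typescript
// Illustrative RAM-based model recommendation (hypothetical, not the
// actual utils/model-recommender.ts): entries are sorted by RAM
// descending, and we return the first entry whose RAM requirement is met.

type Rec = { minRamGb: number; model: string; sizeGb: number };

const RECOMMENDATIONS: Rec[] = [
  { minRamGb: 64, model: "Llama 3.3 70B (Q4)", sizeGb: 40 },
  { minRamGb: 32, model: "Qwen 2.5 Coder 32B (Q4)", sizeGb: 18 },
  { minRamGb: 16, model: "Qwen 2.5 Coder 14B", sizeGb: 9 },
  { minRamGb: 8, model: "Llama 3.1 8B", sizeGb: 4.7 },
  { minRamGb: 4, model: "Qwen 2.5 Coder 3B", sizeGb: 1.9 },
];

function recommend(ramGb: number): string | undefined {
  return RECOMMENDATIONS.find((r) => ramGb >= r.minRamGb)?.model;
}

console.log(recommend(32)); // "Qwen 2.5 Coder 32B (Q4)"
console.log(recommend(16)); // "Qwen 2.5 Coder 14B"
```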
```
ultraagent models --ram 32   # show recommendations for your RAM
```

LM Studio Optimal Settings
When using LM Studio, set these in Settings > Server for best performance:
| Setting | Recommended | Why |
|:--------|:------------|:----|
| Flash Attention | On (not Auto!) | Reduces memory for long contexts |
| KV Cache Quantization | Q8_0 (16-32GB) / F16 (64GB+) | Halves KV-cache memory, minimal quality loss |
| GPU Offload | Max (all layers) | Full Metal GPU acceleration |
| Prompt Caching | Auto | LM Studio caches automatically |
`ultraagent local setup` sets these values automatically based on your hardware.
Commands
```
ultraagent <command> [options]
```

Project
| Command | Description |
|:--------|:------------|
| new | Interactive wizard to scaffold a complete new project |
| scan | Deep-scan the project into docs/scan.md (auto-injected into LLM context) |
| create | Create a project from template |
| analyze | Analyze an existing project's structure and codebase |
Agent
| Command | Description |
|:--------|:------------|
| chat | Interactive session with full tool access |
| run <prompt> | Execute a single prompt non-interactively |
| edit | Edit or refactor existing code |
| plan <prompt> | Generate a structured project plan |
| code <prompt> | Generate code from a prompt |
Configuration & Local LLM
| Command | Description |
|:--------|:------------|
| config show | Display current configuration |
| config set <key> <value> | Update a configuration value |
| config setup | Setup wizard (redirects to local setup) |
| config reset | Reset to defaults |
| models | Recommended local models for your hardware |
| local setup | Auto-detect hardware & set optimal config |
| local status | Check local LLM server health + current settings |
| local pull [model] | Download a model via Ollama |
| local list | List installed Ollama models |
Tool System
UltraAgent ships with 18 tools that the LLM invokes autonomously during a session. Each tool has an assigned permission level.
Permission Model
| Level | Behavior |
|:------|:---------|
| Safe | Executes immediately — no confirmation needed |
| Confirm | Requires user approval (file writes, commits) |
| Dangerous | Requires approval with highlighted warning (shell commands) |
When prompted: [y]es · [n]o · [a]lways allow · [d]eny always
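The prompt choices above map naturally onto a session allowlist and denylist. The following TypeScript sketch (hypothetical names, not the actual `permissions.ts`) shows how such a check might behave:

```typescript
// Illustrative sketch of the three-tier permission check (hypothetical
// names, not the actual source). "[a]lways allow" adds the tool to a
// session allowlist; "[d]eny always" blocks it for the rest of the session.

type Permission = "safe" | "confirm" | "dangerous";
type Choice = "y" | "n" | "a" | "d";

const sessionAllow = new Set<string>();
const sessionDeny = new Set<string>();

function mayRun(
  tool: string,
  level: Permission,
  ask: (tool: string, highlightWarning: boolean) => Choice, // stands in for the prompt
): boolean {
  if (level === "safe") return true;               // executes immediately
  if (sessionDeny.has(tool)) return false;         // denied for the session
  if (sessionAllow.has(tool)) return true;         // allowed for the session
  const choice = ask(tool, level === "dangerous"); // dangerous gets a highlighted warning
  if (choice === "a") sessionAllow.add(tool);
  if (choice === "d") sessionDeny.add(tool);
  return choice === "y" || choice === "a";
}

console.log(mayRun("write_file", "confirm", () => "a")); // true (prompted once)
console.log(mayRun("write_file", "confirm", () => "n")); // true (allowlisted, no prompt)
```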
Tools
| Tool | Permission | Description |
|:-----|:-----------|:------------|
| read_file | Safe | Read file contents with line numbers, offset, limit |
| write_file | Confirm | Create or overwrite files; auto-creates directories |
| edit_file | Confirm | Precise string replacement via diff-based editing |
| bash | Dangerous | Execute shell commands with timeout and output limits |
| glob | Safe | Find files by glob pattern |
| grep | Safe | Search file contents with regex and context lines |
| Tool | Permission | Description |
|:-----|:-----------|:------------|
| git_status | Safe | Working tree status |
| git_diff | Safe | Staged and unstaged changes |
| git_log | Safe | Recent commit history |
| git_commit | Confirm | Create a commit |
| git_branch | Confirm | Create, switch, or list branches |
| Tool | Permission | Description |
|:-----|:-----------|:------------|
| sub_agent | Confirm | Spawn parallel sub-agents for independent tasks |
| memory_save | Safe | Persist context to memory |
| memory_search | Safe | Search persistent memory |
| memory_delete | Safe | Delete a memory entry |
| web_search | Safe | Search the web |
| web_fetch | Safe | Fetch content from a URL |
| plan_create | Safe | Create a structured task plan |
| plan_status | Safe | Show plan progress |
| plan_update | Safe | Update task status |
Project Creation
```
ultraagent new
```

Interactive wizard that scaffolds complete, production-ready projects.
```
# Non-interactive
ultraagent new --name my-app --type fullstack --stack nextjs-prisma-pg --prompt "E-commerce platform"
```

| Option | Description |
|:-------|:------------|
| -n, --name <name> | Project name |
| -t, --type <type> | Project type |
| -s, --stack <stack> | Tech stack |
| -d, --dir <path> | Parent directory |
| -p, --prompt <prompt> | Project description |
| Type | Stacks |
|:-----|:-------|
| webapp | React+Vite, Next.js, Vue 3+Vite, Svelte, Astro |
| fullstack | Next.js+Prisma+PG, React+Express, Vue+FastAPI, T3 Stack |
| api | Express, Fastify, NestJS, FastAPI, Hono, Go+Gin, Rust+Actix |
| cli | TypeScript+Commander, Python+Click, Rust+Clap, Go+Cobra |
| library | TypeScript npm, Python PyPI, Rust Crate |
| mobile | React Native+Expo, React Native Bare |
| desktop | Electron+React, Tauri+React |
| monorepo | Turborepo, Nx, pnpm Workspaces |
| ai-ml | Python+LangChain, Python+PyTorch, TypeScript+LangChain |
| custom | Describe what you need |

Optional features:
- Docker + docker-compose
- CI/CD (GitHub Actions)
- Testing (Unit + Integration)
- Linting + Formatting (ESLint / Prettier)
- Authentication
- Database setup
- API Documentation (OpenAPI / Swagger)
- Environment variables (.env)
- Logging
- Error handling / Monitoring
Output
A complete project with directory structure, config files, typed source code, test setup, README, .gitignore, build scripts, initialized git repo, installed dependencies, and auto-generated docs/scan.md.
Project Scanner
```
ultraagent scan                        # scan current project
ultraagent scan --force                # force re-scan
ultraagent scan --dir /path/to/project
```

Creates docs/scan.md — a comprehensive project analysis that is automatically injected into every LLM request, giving the model full project context.
| Section | Content |
|:--------|:--------|
| Overview | Name, type, language, framework, package manager, file count, LOC |
| Git | Branch, remote, last commit |
| Directory Structure | Full tree (4 levels) |
| Key Files | Entry points with roles |
| Dependencies | All deps with versions |
| Scripts | npm scripts with commands |
| Configuration | tsconfig, eslint, prettier, docker, CI/CD, etc. |
| Test Files | All test file paths |
| API Routes | Detected route files |
| Environment Variables | From .env.example |
| Lines of Code | Total and per extension |
| Architecture | AI-generated analysis of patterns and conventions |
Cache: `scan.md` is valid for 24 hours. Use `--force` to regenerate.
Agent Modes
| Mode | Purpose |
|:-----|:--------|
| chat | General assistance — full tool access, memory, planning |
| create | Project setup — structure, dependencies, config |
| analyze | Code review — architecture, patterns, quality |
| edit | Refactoring — precise edits, minimal changes |
| plan | Task planning — phases, dependencies, risks |
| code | Code generation — typed, tested, convention-aware |
Switch modes in-session with /mode <mode>.
Session Commands
| Command | Description |
|:--------|:------------|
| /help | Available commands |
| /new | Create a new project |
| /scan | Scan project into docs/scan.md |
| /mode <mode> | Switch agent mode |
| /dir <path> | Change working directory |
| /clear | Clear terminal |
| /status | Session status and config |
| /history | Conversation history |
| /undo | Undo last file change |
| /tokens | Token usage |
| /plan | Current plan progress |
| /exit | End session |
Hook System
Automate actions before or after tool calls. Config lives in .ultraagent/hooks.json.
```json
{
  "hooks": [
    {
      "timing": "after",
      "tool": "write_file",
      "command": "npx prettier --write ${file}",
      "enabled": true
    },
    {
      "timing": "after",
      "tool": "edit_file",
      "command": "npm test",
      "enabled": true
    }
  ]
}
```

| Field | Type | Description |
|:------|:-----|:------------|
| timing | `before` \| `after` | When to run |
| tool | string | Tool name (* for all) |
| command | string | Shell command |
| blocking | boolean | Abort tool call on hook failure (before hooks only) |
| enabled | boolean | Active state |
Variables: `${file}` (file path), `${tool}` (tool name)
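A `before` hook with `blocking: true` can also gate risky operations. The sketch below uses hypothetical commands (adapt them to your project): it aborts `edit_file` when the type-check fails and appends a log line after every tool call.

```json
{
  "hooks": [
    {
      "timing": "before",
      "tool": "edit_file",
      "command": "npx tsc --noEmit",
      "blocking": true,
      "enabled": true
    },
    {
      "timing": "after",
      "tool": "*",
      "command": "echo '${tool} finished' >> .ultraagent/tool.log",
      "enabled": true
    }
  ]
}
```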
Context Injection
UltraAgent automatically enriches every LLM request with project context:
| Source | Description |
|:-------|:------------|
| Project detection | Language, framework, package manager, scripts |
| docs/scan.md | Full project scan |
| Persistent memory | Saved context from prior sessions |
| Active plan | Task plan with status |
| ULTRAAGENT.md | Project-specific custom instructions |
| Git status | Branch, changes, recent commits |
Priority: Mode prompt > Project context > scan.md > Memories > Plan > History
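The priority order can be pictured as a budgeted concatenation: higher-priority sources are added first, and lower-priority ones are dropped once the context window fills. A minimal TypeScript sketch, with hypothetical names and a character budget standing in for a real token budget:

```typescript
// Illustrative sketch of priority-ordered context assembly (hypothetical
// names, not the actual source). Higher-priority sources are added first;
// once the budget is exhausted, everything below is dropped.

const PRIORITY = [
  "modePrompt",
  "projectContext",
  "scanMd",
  "memories",
  "plan",
  "history",
] as const;

type Source = (typeof PRIORITY)[number];

function buildContext(
  sources: Partial<Record<Source, string>>,
  budgetChars: number, // stand-in for a real token budget
): string {
  const parts: string[] = [];
  let used = 0;
  for (const key of PRIORITY) {
    const text = sources[key];
    if (!text) continue;
    if (used + text.length > budgetChars) break; // drop this source and all below
    parts.push(text);
    used += text.length;
  }
  return parts.join("\n\n");
}

const prompt = buildContext(
  { history: "old conversation turns", modePrompt: "You are in edit mode." },
  60,
);
console.log(prompt); // mode prompt first, then history
```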
Custom Instructions
Create ULTRAAGENT.md in your project root:
```markdown
# Project Instructions
- TypeScript monorepo using pnpm
- Run `pnpm test` after changes
- Use conventional commits
```

Global instructions: `~/.ultraagent/instructions.md`
Configuration
```
ultraagent local setup                        # auto-detect & configure (recommended)
ultraagent config set provider local-openai
ultraagent config set localModel qwen2.5-coder:14b
ultraagent config show                        # view config
ultraagent config reset                       # reset to defaults
```

Environment Variables

```
ULTRAAGENT_PROVIDER=local-openai
ULTRAAGENT_LOCAL_BASE_URL=http://localhost:1234/v1
ULTRAAGENT_LOCAL_MODEL=qwen2.5-coder:14b
ULTRAAGENT_LOCAL_CONTEXT_LENGTH=32768
ULTRAAGENT_LOCAL_TEMPERATURE=0.7
ULTRAAGENT_LOCAL_GPU_LAYERS=-1
ULTRAAGENT_LOCAL_BATCH_SIZE=512
ULTRAAGENT_LOCAL_FLASH_ATTENTION=true
ULTRAAGENT_LOCAL_KV_CACHE_TYPE=q8_0
```

| Option | Default | Description |
|:-------|:--------|:------------|
| model | qwen2.5-coder:14b | Model identifier |
| provider | local-openai | ollama or local-openai |
| maxTokens | 8192 | Max output tokens per response |
| streaming | true | Real-time token streaming |
| theme | default | default · minimal · verbose |
| history | true | Keep conversation history |
| maxSubAgents | 5 | Max concurrent sub-agents |
| localBaseUrl | (auto) | Local LLM server URL (auto-detected per provider) |
| localModel | — | Local model name |
| localGpuLayers | -1 | GPU layers (-1 = all) |
| localContextLength | 32768 | Context window size |
| localTemperature | 0.7 | Sampling temperature |
| localBatchSize | 512 | Inference batch size |
| localFlashAttention | true | Enable flash attention |
| localKvCacheType | q8_0 | KV-cache quantization (f16, q8_0, q4_0) |
Architecture
```
src/
├── cli.ts                    # CLI entry point (Commander.js)
├── index.ts                  # Programmatic API exports
├── commands/                 # Command implementations
│   ├── new.ts                # Project creation wizard
│   ├── scan.ts               # Project scanner
│   ├── chat.ts               # Interactive session
│   ├── edit.ts               # Code editing
│   ├── run.ts                # Single prompt execution
│   ├── config-cmd.ts         # Configuration management
│   ├── local.ts              # Local LLM auto-setup
│   └── models.ts             # Model recommendations
├── core/                     # Core engine
│   ├── agent-factory.ts      # Agent loop, streaming, context injection
│   ├── session.ts            # Interactive REPL with slash commands
│   ├── context.ts            # Git context & project instructions
│   ├── context-manager.ts    # Token-aware history management
│   ├── local-llm.ts          # Local LLM provider adapters
│   ├── planner.ts            # Plan creation & task tracking
│   ├── hooks.ts              # Pre/post tool execution hooks
│   ├── token-tracker.ts      # Usage tracking
│   └── undo.ts               # File change rollback
├── tools/                    # Tool system (18 tools)
│   ├── base.ts               # Registry, permission levels, definitions
│   ├── permissions.ts        # Permission manager & session allowlist
│   ├── read.ts / write.ts / edit.ts / bash.ts / glob.ts / grep.ts
│   ├── git.ts                # git_status, git_diff, git_log, git_commit, git_branch
│   ├── sub-agent.ts          # Parallel sub-agent execution
│   ├── memory.ts             # memory_save, memory_search, memory_delete
│   ├── web.ts                # web_search, web_fetch
│   └── index.ts              # Barrel exports
├── ui/                       # Terminal UI
│   ├── theme.ts              # Colors, gradients, ASCII banner
│   ├── spinner.ts            # Loading spinners
│   └── permission-prompt.ts  # Interactive permission dialog
└── utils/
    ├── config.ts             # Zod-validated config with env support
    ├── hardware.ts           # Apple Silicon hardware detection
    ├── backend-detect.ts     # LLM backend auto-detection
    ├── model-recommender.ts  # RAM-based model recommendation engine
    └── files.ts              # Project detection & analysis
```

Tech Stack
| Category | Technology |
|:---------|:-----------|
| Language | TypeScript 5.7 (ES Modules) |
| AI | LangChain · LangGraph |
| LLM Backends | Ollama · LM Studio · MLX-LM · llama.cpp |
| CLI | Commander.js · Inquirer.js |
| Validation | Zod |
| UI | Chalk · Ora · Boxen · Gradient-string · cli-table3 |
| Testing | Vitest |
Development
```
npm run build          # compile TypeScript -> dist/
npm run dev            # run via tsx (no build)
npm start              # run compiled output
npm test               # run tests
npm run test:watch     # tests in watch mode
npm run clean          # remove dist/
```

Contributing
Contributions are welcome. Please open an issue first to discuss what you'd like to change.
- Fork the repository
- Create your branch (`git checkout -b feature/my-feature`)
- Commit your changes
- Push to the branch
- Open a Pull Request
License
Built by Daniel Zurmuhle
Made in Switzerland
