makhlouf
v3.1.0
Standalone TypeScript CLI that acts as a senior banking software architect
Makhlouf v3.0.0
The most comprehensive standalone CLI for building enterprise Django banking systems with any LLM provider.
Created by Ahmed Makhlouf — 60,000+ lines of banking domain expertise, 25 commands, 13 AI agents, 50+ knowledge files, multi-provider LLM support (Anthropic, OpenRouter, Ollama), and local session persistence.
What is Makhlouf?
Makhlouf is a Claude Code plugin that turns Claude into a senior banking software architect. It knows how to build PCI DSS-compliant, enterprise-grade Django banking systems from scratch — with proper ACID ledger engines, maker-checker approvals, encrypted PII, multi-tenant IAM, and 35 banking domain apps.
One command to scaffold. One command to build. One command to ship.
Key Features
| Category | What You Get |
|----------|-------------|
| 25 Commands | Full lifecycle: scaffold → build → test → QA → audit → ship |
| 13 AI Agents | Parallel app builders, QA agents, security testers, persona simulators |
| 18 Backend Decisions | Centralized from Day 1: BaseModel, IAM, Ledger, Pipeline, Approval, Forms |
| 16 Frontend Decisions | React 19 + TypeScript + Tailwind 4 + shadcn/ui with HMAC signing |
| 35 Banking Apps | Accounts, transfers, cards, loans, FX, fees, compliance, KYC, and more |
| 10-Wave Build | Dependency-ordered parallel construction with contract verification |
| 7-Layer Tests | Unit → service → integration → contract → security → performance → chaos |
| 352+ Tests | Plugin self-test suite covering all components |
| Cross-Session Memory | Decisions persist across Claude sessions via .forge/intelligence/ |
| Project Style Adaptation | Learns and applies project-specific coding conventions |
| Self-Learning | Risk heatmap, coupling map, predictions — gets smarter each build |
| PCI DSS Level 1 | 12 requirements mapped to code checks |
| Multi-Persona Sim | 50+ AI actors playing merchant/teller/admin/compliance roles |
Commands
Phase 15 ports all 25 makhlouf-* commands from the v2.1.0 plugin into the standalone CLI. After setting ANTHROPIC_API_KEY, run any of the commands below (plus chat as the raw escape hatch). Run makhlouf <command> --help for per-command flags and examples.
Build
- makhlouf new — Initialize a new Django banking project from scratch
- makhlouf execute — Build ALL apps wave-by-wave with parallel agents
- makhlouf app — Build a single Django app from scratch with TDD
- makhlouf feature — Build a feature spanning multiple Django apps end-to-end
- makhlouf extend — Add a new feature to ONE existing Django app
- makhlouf frontend — Build React frontend components for a Django banking app
- makhlouf wire — Connect Django backend to React frontend with typed clients
Fix and quality
- makhlouf fix — Fix a bug with multi-agent diagnosis and parallel fix attempts
- makhlouf test — Run the automated 7-layer test suite (code-level, minutes)
- makhlouf qa — Exhaustive browser E2E, API smoke, and visual regression QA
- makhlouf audit — Security audit: permissions, encryption, PII, compliance
- makhlouf clean — Detect code smells: duplicates, dead code, orphan migrations
- makhlouf refactor — Execute structural code changes: split apps, move models
Ship and observe
- makhlouf status — Show Makhlouf build progress dashboard
- makhlouf scan — Rebuild .forge/contracts/ from live code after manual edits
- makhlouf migrate — Safe migration management with zero-downtime patterns
- makhlouf ship — Run full test suite, generate report, and create a pull request
- makhlouf docs — Auto-generate API docs and architecture diagrams from live code
- makhlouf run — Run the banking system locally and verify service health
Intelligence and maintenance
- makhlouf simulate — Multi-persona business simulation across merchant, teller, admin
- makhlouf study — Deep-read your own codebase to build contracts and conventions
- makhlouf learn — Study an external codebase and compare it against your project
- makhlouf integrate — Replace stub providers with real external services (Twilio, etc.)
- makhlouf test-report — Show last test results with ship-readiness verdict
- makhlouf update — Update Makhlouf plugin to the latest version from GitHub
Escape hatch
- makhlouf chat "<prompt>" — Run a single agent with any prompt (bypasses the command catalog, from Phase 13)
Non-interactive mode
Pass --no-interactive to suppress the spinner and ANSI colors for CI-friendly output:
makhlouf --no-interactive status
CI=1 makhlouf status # auto-detected when CI env is set or stdout is not a TTY
The flag keeps the raw token stream, tool call lines, and usage footer intact so CI logs remain meaningful. It is globally applicable and propagates to every catalog command plus chat.
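The detection rule above (explicit flag, CI environment variable, or a non-TTY stdout) can be sketched as a small pure function. This is a minimal illustration, not Makhlouf's actual code: the `RuntimeInfo` shape and the `isNonInteractive` name are hypothetical, and treating any non-empty `CI` value as "set" is an assumption.

```typescript
// Hypothetical sketch of the non-interactive detection rule described above.
// None of these names come from the Makhlouf source.
interface RuntimeInfo {
  flagNoInteractive: boolean; // true when --no-interactive was passed
  env: Record<string, string | undefined>;
  stdoutIsTTY: boolean;
}

function isNonInteractive(info: RuntimeInfo): boolean {
  // The explicit flag always wins.
  if (info.flagNoInteractive) return true;
  // Assumption: any non-empty CI value counts as "CI env is set".
  const ci = info.env["CI"];
  if (ci !== undefined && ci !== "") return true;
  // Piped or redirected stdout is treated as non-interactive.
  return !info.stdoutIsTTY;
}
```

In a real CLI the inputs would likely come from `process.env` and `process.stdout.isTTY`; passing them in keeps the rule easy to test.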
Quick Start
1. Install
# Clone the plugin
git clone https://github.com/ahmedmk/makhlouf.git ~/.claude/skills/makhlouf
# Create symlinks for Claude Code discovery
cd ~/.claude/skills
for skill in $(ls -d makhlouf/skills/makhlouf-*/); do
ln -sfn "$skill" "$(basename $skill)"
done
# Make scripts executable
chmod +x ~/.claude/skills/makhlouf/bin/*.sh
2. Start a New Session
# Open Claude Code in any directory
claude
# Initialize a new banking project
/makhlouf-new
# Build all apps (parallel agents)
/makhlouf-execute
# Run tests
/makhlouf-test
# Ship it
/makhlouf-ship
3. Learn an Existing Codebase
# Point at any Django project
cd /path/to/existing-project
/makhlouf-study
# Now all commands work with context
/makhlouf-extend # add features
/makhlouf-fix # fix bugs
/makhlouf-audit # security check
4. Try the v3.0.0 standalone CLI with any provider
Makhlouf v3.0.0 ships as a standalone npm-installable CLI with built-in support
for Anthropic Claude, OpenRouter, and local Ollama. After npm install && npm run build
(or npx tsx src/cli.ts for development), the five built-in aliases are ready to use
out of the box:
# Anthropic Claude (requires ANTHROPIC_API_KEY)
makhlouf chat "Explain the double-entry ledger service" --model fast # claude-haiku-4-5
makhlouf chat "Design a payments microservice" --model smart # claude-opus-4-6
makhlouf chat "Audit the approval workflow" --model balanced # claude-sonnet-4-6
# OpenRouter (requires OPENROUTER_API_KEY; free tier available)
makhlouf chat "Translate error messages to Spanish" --model cheap # llama-3.3-70b-instruct
# Ollama (requires `ollama serve` running locally; no key)
makhlouf chat "Summarize this commit" --model local # llama3.2
Save a conversation to resume it later:
makhlouf chat "Remember my project is a banking system" --save context-1
makhlouf chat "What architecture did we choose?" --resume context-1
See the Providers, Model Aliases, and Session Persistence sections below for the full reference.
Architecture
Makhlouf enforces 18 centralized backend decisions and 16 frontend decisions from Day 1:
Backend (Django)
- BaseModel — UUID v7, timestamps, soft delete for ALL models
- HasIAMPermission — centralized auth, iam_resource = model_name_plural
- Middleware Stack — 15-layer security stack in exact order
- Correlation ID — every request, service, task, log entry
- Audit Logging — log_action() on every mutation
- API Envelope — api_response() for all endpoints
- Encryption — KEK/DEK for all PII fields
- SystemConfig — no hardcoded values, ever
- Celery Queues — 5 queues with correlation ID propagation
- Error Handler — centralized with domain error codes
- DB Routing — read replicas for reports
- Lazy Imports — cross-app imports inside functions only
- FormService — centralized dynamic form builder
- PipelineService — centralized multi-step flow engine
- ApprovalService — maker-checker with dual control
- LedgerService — ACID double-entry with computed balances
- Database Design — partitioning, indexing, microservice-ready
- Error Codes — permanent codes by domain (AUTH_1xxx, ACC_2xxx, TXN_3xxx)
Frontend (React 19 + TypeScript)
1. PageLayout, 2. CanAccess, 3. API Client (HMAC), 4. React Query,
5. Correlation ID, 6. Money Component, 7. MaskedField (PII),
8. ErrorBoundary, 9. DynamicForm, 10. PipelineTracker,
11. ApprovalQueue, 12. DataTable, 13. AuthGuard (JWT in-memory),
14. Feature Flags, 15. WebSocket, 16. i18n + RTL
Knowledge Base
50+ knowledge files covering:
- Banking Domain — 35 app definitions with models, services, endpoints
- Security — KEK/DEK encryption, HMAC signing, PCI DSS compliance, OWASP
- Performance — N+1 detection, caching rules, connection pooling
- Operations — Rollback strategy, multi-environment, webhook patterns
- Testing — 7-layer test guides, TDD enforcement, coverage strategy
- Architecture — API versioning, deadlock prevention, middleware stack
- Frontend — Auth patterns, component library, E2E testing
- Intelligence — Risk detection, learning feedback, deep analysis
Project Structure
~/.claude/skills/makhlouf/
├── SKILL.md # Main router (25 commands)
├── README.md # This file
├── INSTALL.md # Detailed installation guide
├── GUIDE.md # User guide with workflows
├── VERSION # 2.1.0
├── skills/ # 24 command skills
│ ├── makhlouf-new/
│ ├── makhlouf-execute/
│ ├── makhlouf-app/
│ ├── makhlouf-test/
│ └── ... (21 more)
├── agents/ # 13 AI agent prompts
│ ├── app-builder.md
│ ├── frontend-builder.md
│ ├── qa-smoke-agent.md
│ ├── security-tester.md
│ └── ... (9 more)
├── knowledge/ # 50+ knowledge files
│ ├── architecture-blueprint.md # THE blueprint (18 decisions)
│ ├── frontend-architecture.md # 16 frontend decisions
│ ├── banking-domain.md # 35 app definitions
│ ├── dependency-graph.md # 10-wave build order
│ ├── security-patterns.md
│ ├── pci-dss-compliance.md
│ ├── patterns/ # Domain patterns
│ └── testing/ # Test layer guides
├── templates/ # Code templates
│ ├── project/ # Project scaffold
│ └── app/ # App scaffold
├── bin/ # Shell scripts
│ ├── forge-lint.sh # Convention linter (9 rules)
│ ├── forge-state.sh # Build state management
│ └── forge-memory.sh # Memory management
└── tests/
└── test_plugin.sh # Plugin self-test (352+ tests)
Requirements
- Claude Code (CLI, Desktop, or VS Code extension)
- Claude Opus or Sonnet model
- No other dependencies — the plugin is pure markdown
Updates
cd ~/.claude/skills/makhlouf
git pull origin main
Phase 13 Quickstart — Provider Abstraction + First LLM Call
Phase 13 delivers the minimal end-to-end chain for the standalone TypeScript
CLI (v3.0.0): a single agent can call an LLM via the provider abstraction,
use Phase 12 tools, and stream output to the terminal.
Prerequisites
- Node.js >= 20 (tested against Node 22)
- An Anthropic API key (get one at https://console.anthropic.com)
Install and build
npm install
npm run build
Run the smoke test
export ANTHROPIC_API_KEY=sk-ant-...
npm run smoke
Expected output when the key is set:
SMOKE: starting — agent=app-builder model=claude-haiku-4-5
SMOKE: prompt="List three files in the current directory. Be brief."
---
⏺ app-builder (claude-haiku-4-5)
... streamed response ...
tokens: N in / M out • reason: stop
---
SMOKE: pass (reason=stop, tokens=Nin/Mout)
If ANTHROPIC_API_KEY is not set, the smoke test prints
SKIP: ANTHROPIC_API_KEY not set and exits 0 — so CI and local runs
without a key stay green.
Run a chat command
# Default agent (app-builder), default model (from .makhloufrc or schema default)
npx tsx src/cli.ts chat "What tools do you have?"
# Override agent
npx tsx src/cli.ts chat -a fix-agent "Explain the last error"
# Override model
npx tsx src/cli.ts chat -m claude-opus-4-6 "Design a payments service"
# Verbose event logging (event types only — prompt content is never logged)
npx tsx src/cli.ts chat -v "List files"
Run the opt-in integration test
export ANTHROPIC_API_KEY=sk-ant-...
npm run test:integration
This hits the real Anthropic API with a trivial prompt and
claude-haiku-4-5 (cheapest valid model). It is skipped by default when
ANTHROPIC_API_KEY is not set, so the regular npm test suite never
incurs API costs.
Exit codes
| Code | Meaning |
|------|-----------------------------------------------------------------|
| 0 | Success |
| 1 | Missing ANTHROPIC_API_KEY or unknown agent |
| 2 | Provider error, config load error, or renderer crash |
| 3 | Tool loop cap hit (25 iterations) |
| 4 | Budget exceeded — history cannot be compacted below the ceiling |
| 130 | Aborted via SIGINT (Ctrl+C) |
Team Mode
Makhlouf v3.0.0 introduces a team agent orchestrator that runs a sequential 4-role pipeline — architect → builder → security → qa — with deterministic verification gates between handoffs. Each role reuses the single-agent executor with its own fresh context window, independent agent prompt, and configurable model. The command is invoked as makhlouf team "<task>" and streams role-scoped events to the terminal in real time.
What it does:
- Architect designs the change (default agent: app-builder). Gate: design-consistency sanity check on the assistant text.
- Builder implements the change (default agent: extend-agent). Gate: auto-detected npm run lint, npm run typecheck, and npm test from your package.json — only commands that actually exist are run, so a project with only lint gets a one-command gate.
- Security audits for PCI DSS and banking concerns (default agent: security-tester). Gate: report-check (verifies the role wrote a non-empty artifact).
- QA plans and generates tests (default agent: test-master). Gate: report-check.
Between each role a verification gate runs. On gate failure, the current role is retried (up to 2 additional attempts for architect and builder; security and qa are read-only and do NOT retry). On retry exhaustion, the pipeline aborts with exit code 5 (team gate exhausted).
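As an illustration of the builder gate and retry rules above, the following sketch picks gate commands from the `scripts` section of a package.json and computes the attempt budget per role. It is a sketch under stated assumptions, not the orchestrator's real code: `detectGateCommands` and `maxAttemptsFor` are hypothetical names.

```typescript
// Hypothetical sketch: builder-gate detection and per-role attempt budgets.
type Role = "architect" | "builder" | "security" | "qa";
type Scripts = Record<string, string>;

// Only scripts that actually exist in package.json become gate commands.
function detectGateCommands(scripts: Scripts): string[] {
  const commands: string[] = [];
  if (Object.prototype.hasOwnProperty.call(scripts, "lint")) commands.push("npm run lint");
  if (Object.prototype.hasOwnProperty.call(scripts, "typecheck")) commands.push("npm run typecheck");
  if (Object.prototype.hasOwnProperty.call(scripts, "test")) commands.push("npm test");
  return commands;
}

// Architect and builder get 1 initial try plus up to 2 retries;
// security and qa are read-only and do not retry.
function maxAttemptsFor(role: Role): number {
  return role === "architect" || role === "builder" ? 3 : 1;
}
```

A project whose package.json defines only `lint` therefore gets a one-command gate, matching the behavior described for the builder role.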
Session artifacts are written to .forge/team/<session-id>/ where <session-id> is <YYYYMMDD-HHmmss>-<6char hex>:
- architect.md, builder.md, security.md, qa.md — per-role output with YAML frontmatter (role, agent, model, status, gate result, files touched) and the full assistant text. Each role's artifact is written BEFORE its gate runs, so read-only report-check gates for security/qa have a real file to inspect.
- summary.md — aggregated report with per-role status table, files touched, and verdict. ALWAYS written at pipeline end, whether the pipeline succeeded or aborted.
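The session-id shape `<YYYYMMDD-HHmmss>-<6char hex>` can be illustrated with a small formatter. This is a sketch only: `formatSessionId` is a hypothetical name, and using UTC and lowercase hex are assumptions, since the text above does not say which time zone or letter case the CLI uses.

```typescript
// Hypothetical sketch of the <YYYYMMDD-HHmmss>-<6char hex> session id format.
function pad2(n: number): string {
  return String(n).padStart(2, "0");
}

// `hex6` is passed in so the function stays deterministic and testable.
function formatSessionId(now: Date, hex6: string): string {
  if (!/^[0-9a-f]{6}$/.test(hex6)) {
    throw new Error("expected exactly 6 lowercase hex characters");
  }
  // Assumption: UTC components; the real CLI may use local time.
  const date = `${now.getUTCFullYear()}${pad2(now.getUTCMonth() + 1)}${pad2(now.getUTCDate())}`;
  const time = `${pad2(now.getUTCHours())}${pad2(now.getUTCMinutes())}${pad2(now.getUTCSeconds())}`;
  return `${date}-${time}-${hex6}`;
}
```

For 5 January 2025 at 09:03:07 UTC with the suffix `a1b2c3`, this yields `20250105-090307-a1b2c3`, so the artifacts would live under `.forge/team/20250105-090307-a1b2c3/`.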
Per-role model assignment
You can run each role on a different model via .makhloufrc:
{
"team": {
"architect": { "agent": "app-builder", "model": "claude-opus-4-6" },
"builder": { "agent": "extend-agent", "model": "claude-sonnet-4-6" },
"security": { "agent": "security-tester", "model": "claude-opus-4-6" },
"qa": { "agent": "test-master", "model": "claude-haiku-4-5" }
}
}
Or override per invocation via CLI flags:
# Default models from .makhloufrc (or baseConfig.model as the final fallback)
makhlouf team "Add wire transfer to accounts app"
# Override the builder's model for one run
makhlouf team "Fix balance rounding" --team-model-builder claude-opus-4-6
# Full manual assignment
makhlouf team "Audit payment flows" \
--team-model-architect claude-opus-4-6 \
--team-model-builder claude-sonnet-4-6 \
--team-model-security claude-opus-4-6 \
--team-model-qa claude-haiku-4-5
The priority chain is:
cliOverrides.<role>.model > teamConfig.<role>.model > baseConfig.model
Note: The global --model flag is ignored when running makhlouf team. Use --team-model-<role> flags to override models per role.
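The priority chain above maps naturally onto nullish coalescing. The sketch below is illustrative: the `RoleSettings` type and `resolveRoleModel` function are hypothetical names, while the three-level ordering comes from the text.

```typescript
// Hypothetical sketch of the per-role model priority chain.
type TeamRole = "architect" | "builder" | "security" | "qa";
type RoleSettings = Partial<Record<TeamRole, { agent?: string; model?: string }>>;

function resolveRoleModel(
  role: TeamRole,
  cliOverrides: RoleSettings,
  teamConfig: RoleSettings,
  baseModel: string,
): string {
  // CLI flag first, then the "team" block in .makhloufrc, then the base model.
  return cliOverrides[role]?.model ?? teamConfig[role]?.model ?? baseModel;
}

// Usage: only the builder is overridden on the command line.
const teamConfig: RoleSettings = {
  builder: { agent: "extend-agent", model: "claude-sonnet-4-6" },
  qa: { agent: "test-master", model: "claude-haiku-4-5" },
};
const cli: RoleSettings = { builder: { model: "claude-opus-4-6" } };
const builderModel = resolveRoleModel("builder", cli, teamConfig, "claude-sonnet-4-6");
const architectModel = resolveRoleModel("architect", cli, teamConfig, "claude-sonnet-4-6");
```

Here the builder resolves to the CLI override and the architect falls through to the base model because neither layer sets it.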
Exit codes
Team mode uses the standard Makhlouf exit code set plus one new code:
| Code | Meaning |
|------|-----------------------------------------------------------------|
| 0 | Success (all 4 roles passed their gates) |
| 1 | User error (missing API key, bad flags, unknown agent) |
| 2 | Provider error, config load failure, or renderer crash |
| 3 | Tool loop cap hit |
| 4 | Context budget exceeded |
| 5 | Team gate exhausted — a verification gate failed after max retries |
| 130 | Aborted via SIGINT (Ctrl+C) |
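A CI wrapper that shells out to makhlouf team may want to turn these codes into readable log lines. The sketch below is one way to do that; `describeTeamExit` is a hypothetical helper, and only the code-to-meaning pairs are taken from the table.

```typescript
// Hypothetical helper for CI wrappers: map a team-mode exit code to its meaning.
const TEAM_EXIT_CODES = new Map<number, string>([
  [0, "success (all 4 roles passed their gates)"],
  [1, "user error (missing API key, bad flags, unknown agent)"],
  [2, "provider error, config load failure, or renderer crash"],
  [3, "tool loop cap hit"],
  [4, "context budget exceeded"],
  [5, "team gate exhausted"],
  [130, "aborted via SIGINT"],
]);

function describeTeamExit(code: number): string {
  // Codes outside the documented set are reported as unknown rather than guessed.
  return TEAM_EXIT_CODES.get(code) ?? `unknown exit code ${code}`;
}
```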
Session artifact retention
Session artifacts under .forge/team/ are kept after the session ends — never auto-deleted. The retention decision is intentional: you can inspect failed runs, replay artifacts through a reviewer, and keep an audit trail of what each role produced. If your project contains sensitive banking code, add .forge/team/ to .gitignore so artifacts never reach your git history:
# .gitignore
.forge/team/
Users who want periodic cleanup can run makhlouf clean (existing command from v2.1.0) or delete old session directories manually.
Verbose logging and secrets
Like every Makhlouf command, makhlouf team --verbose logs to stderr using event TYPES only — never the userPrompt, assistant text, or gate stderr. The verbose stream prints lines like [verbose] team:role-start(architect) and [verbose] team:role-event(builder:text-delta) so you can trace pipeline progress in CI logs without leaking any content.
Context Budget
Makhlouf tracks token usage per model and enforces a configurable ceiling (default: 70% of the model's context window) to leave room for the model's own reasoning and tool output. When the ceiling is approached, the CLI auto-compacts the conversation trail while preserving the pinned knowledge block byte-for-byte — your 50+ markdown knowledge files survive every compaction cycle untouched.
Ceiling configuration
In .makhloufrc:
{
"preferences": {
"contextCeiling": 0.7
}
}
The contextCeiling is a FRACTION between 0.1 and 1.0 (default 0.7).
It is multiplied by the active model's context length. Example: on a
1,000,000-token Sonnet 4.6 session, the absolute ceiling is 700_000 tokens.
Supported models and context lengths
| Model | Context window | Tokenizer |
|---------------------|---------------:|---------------------|
| claude-opus-4-6 | 1,000,000 | Anthropic server |
| claude-sonnet-4-6 | 1,000,000 | Anthropic server |
| claude-haiku-4-5 | 200,000 | Anthropic server |
| claude-opus-4-5 | 200,000 | Anthropic server |
| claude-sonnet-4-5 | 200,000 | Anthropic server |
| claude-opus-4-1 | 200,000 | Anthropic server |
| gpt-4o / gpt-4o-mini | 128,000 | tiktoken o200k_base |
| llama3.1 / llama3.2 | 128,000 | tiktoken o200k_base |
| (unknown model) | 32,000 | tiktoken o200k_base (with stderr warning) |
Unknown models fall back to a conservative 32,000-token ceiling AND emit
a one-shot warning to stderr. Add new entries to
src/context/model-caps.ts to promote a model to its true context length.
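To make the ceiling arithmetic and the unknown-model fallback concrete, here is a minimal sketch. It is not the contents of src/context/model-caps.ts: the names `contextLengthFor` and `absoluteCeiling` are hypothetical, only a few table rows are reproduced, and warning once per model name (rather than once per process) and rounding to the nearest token are assumptions.

```typescript
// Hypothetical sketch of a context-length lookup with a conservative fallback.
const MODEL_CONTEXT = new Map<string, number>([
  ["claude-opus-4-6", 1_000_000],
  ["claude-sonnet-4-6", 1_000_000],
  ["claude-haiku-4-5", 200_000],
  ["gpt-4o", 128_000],
  ["llama3.2", 128_000],
]);
const FALLBACK_CONTEXT = 32_000;
const warnedModels = new Set<string>();

function contextLengthFor(model: string, warn: (msg: string) => void = console.error): number {
  const known = MODEL_CONTEXT.get(model);
  if (known !== undefined) return known;
  // Assumption: "one-shot" means one warning per unknown model name.
  if (!warnedModels.has(model)) {
    warnedModels.add(model);
    warn(`unknown model "${model}": assuming a ${FALLBACK_CONTEXT}-token context window`);
  }
  return FALLBACK_CONTEXT;
}

function absoluteCeiling(model: string, fraction = 0.7, warn?: (msg: string) => void): number {
  if (fraction < 0.1 || fraction > 1.0) {
    throw new RangeError("contextCeiling must be between 0.1 and 1.0");
  }
  // Rounded to avoid floating-point noise in the product.
  return Math.round(contextLengthFor(model, warn) * fraction);
}
```

With the default fraction, `absoluteCeiling("claude-sonnet-4-6")` gives 700,000 tokens and an unlisted model gives 22,400 tokens (70% of 32,000).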
Compaction behavior
When token usage meets or exceeds the ceiling between agent turns, Makhlouf:
- Counts the pinned knowledge block ONCE at session start (via the provider's countTokens for Claude, js-tiktoken o200k_base for others).
- Segments the message trail into atomic MessageGroups so tool-call and tool-result messages are never split across a compaction boundary.
- Builds a deterministic structural summary (no LLM call — just turn counts, tool histograms, and the last user question truncated to 80 chars) and replaces older groups with the summary.
- Preserves the last max(4, floor(ceiling / 4000)) message groups verbatim so the active tool loop is never interrupted.
- Emits a [compacted: A → B (-N groups)] line to stdout so you know history was rewritten.
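The preservation rule in the steps above, max(4, floor(ceiling / 4000)), is easy to check with a worked sketch. The function names below are hypothetical and the group type is left generic; only the formula is taken from the text.

```typescript
// Hypothetical sketch of the "keep the newest groups verbatim" rule.
function groupsToPreserve(ceilingTokens: number): number {
  return Math.max(4, Math.floor(ceilingTokens / 4000));
}

// Splits a trail into the older groups (candidates for the structural summary)
// and the newest groups that stay untouched.
function splitForCompaction<T>(groups: T[], ceilingTokens: number): { older: T[]; kept: T[] } {
  const keep = groupsToPreserve(ceilingTokens);
  const cut = Math.max(0, groups.length - keep);
  return { older: groups.slice(0, cut), kept: groups.slice(cut) };
}
```

With a 700,000-token ceiling the rule keeps the last 175 groups; with an 8,000-token ceiling the floor of 4 applies, so a 10-group trail would have its first 6 groups summarized.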
LLM-based compaction (having the model summarize its own history) is deliberately deferred to v3.1 — the deterministic structural summary keeps Phase 14 reproducible, free of cost, and testable without mocking the model.
Phase 14 limitation: compaction fires between agent calls
Compaction runs between provider.stream calls, not inside a single
streaming response. If a single provider call's output exceeds the
budget (because the model generated a very large response or a tool
returned a huge payload), Makhlouf cannot compact mid-stream — it yields
a budget-exceeded error and exits with code 4. v3.1 may refactor
this to an internal tool loop that can compact at any step boundary.
On exit code 4, the hint is: try a model with a larger context window or
split your task into smaller prompts.
Verbose budget footer
Run with --verbose to see a dim footer after each assistant turn:
[ctx: 12.3k/700.0k (2%) • model: claude-sonnet-4-6]
Or enable it permanently via .makhloufrc (Phase 15 will wire this):
{ "preferences": { "showBudget": true } }
The [compacted: A → B (-N groups)] compaction line is ALWAYS shown,
regardless of the verbose flag, because rewriting history is a visible
action users must know about.
Phase 13 scope and known gaps
In scope (delivered):
- LLMProvider interface + Anthropic implementation via ai@6 + @ai-sdk/anthropic
- Single-agent executor composing Phase 12 building blocks
- Streaming terminal renderer with dim tool-call lines, spinner, and usage footer
- Minimal makhlouf chat <prompt> CLI
- Unit tests + opt-in integration test
Out of scope (deferred):
- OpenRouter and Ollama providers → Phase 17
- Model aliasing (--model fast/smart) → Phase 17
- Context budget / auto-compaction → Phase 14
- Full Commander command tree (all 25 makhlouf-* commands) → Phase 15
- Team orchestration → Phase 16
- Session save/restore → Phase 17
- Retry/backoff on transient errors → Phase 17
Known gaps:
- T-12-07 / T-13-03 path-traversal: Phase 12 deferred path normalization for filePatch/fileWrite tool calls, and Phase 13 inherits this gap. Tracked for Phase 17 hardening. Banking repos should use the allowed tool category filter in .makhloufrc to scope what the agent can touch.
Providers
Makhlouf v3.0.0 supports three LLM providers out of the box, all sharing a common
LLMProvider interface so agents, tool loops, budget tracking, session persistence,
and retry logic work identically regardless of which upstream API is called.
| Provider | Env var | Dependencies | Tool calls | Notes |
|-----------|---------------------|------------------------------|------------|----------------------------------------|
| Anthropic | ANTHROPIC_API_KEY | ai@6 + @ai-sdk/anthropic | yes | Default. Server-side token counting. |
| OpenRouter| OPENROUTER_API_KEY| direct fetch (zero deps) | yes | 300+ models, OpenAI-compatible SSE. |
| Ollama | (none) | direct fetch (zero deps) | model-dep | Local daemon, NDJSON streaming, no key.|
Anthropic Claude (default)
Anthropic is the default provider and is wired through the official @ai-sdk/anthropic
adapter. The provider reads ANTHROPIC_API_KEY from the environment automatically at
call time — Makhlouf's own code never touches the env var.
export ANTHROPIC_API_KEY=sk-ant-...
makhlouf chat "Design a payments service"
Configure in .makhloufrc:
{
"provider": "anthropic",
"model": "claude-sonnet-4-6"
}
Supported models include claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5,
claude-opus-4-5, claude-sonnet-4-5, and claude-opus-4-1. Sonnet 4.6 and
Opus 4.6 carry a 1,000,000-token context window; see "Context Budget" above.
OpenRouter
OpenRouter exposes 300+ models through a single OpenAI-compatible API. Sign up at https://openrouter.ai to get an API key (free tier available for common models).
export OPENROUTER_API_KEY=sk-or-...
makhlouf chat "Analyze the wire-transfer flow" --model cheap
Configure in .makhloufrc:
{
"provider": "openrouter",
"model": "meta-llama/llama-3.3-70b-instruct"
}
Popular models:
- meta-llama/llama-3.3-70b-instruct — 131,072-token context, tool-capable (free tier available via :free suffix)
- mistralai/mistral-large-2407 — 128,000-token context, tool-capable
- openai/gpt-4o / openai/gpt-4o-mini — 128,000-token context, tool-capable
See https://openrouter.ai/models for the full catalog. Any model ID that OpenRouter
accepts works here — entries in src/context/model-caps.ts promote common models
to their correct context length; unknown models fall back to a conservative 32k
ceiling with a one-shot stderr warning.
Ollama (local)
Ollama runs LLMs locally on your machine — no API key, no
network, no per-token cost. Install with brew install ollama (macOS) or from
https://ollama.com, then start the daemon:
ollama serve # in a separate terminal
ollama pull llama3.2 # download a tool-capable model
makhlouf chat "Say hello in one word." --model local
Configure in .makhloufrc:
{
"provider": "ollama",
"model": "llama3.2"
}
Override the daemon endpoint with OLLAMA_HOST:
# Remote Ollama daemon on a workstation
export OLLAMA_HOST=http://192.168.1.50:11434
makhlouf chat "test" --model local
The default endpoint is http://127.0.0.1:11434 (IPv4 literal, not localhost, to
avoid ::1 resolution failures on IPv4-only Ollama deployments).
Tool calling support. Ollama's tool-calling support is model-dependent. Known
tool-capable models: llama3.1, llama3.1:70b, llama3.2, llama3.2:3b,
qwen2.5, qwen2.5-coder, qwen3, devstral, llama4. If you run Makhlouf
against a non-tool-capable model (e.g. mistral, phi3), the provider silently
omits the tools field from the request body — the agent can still generate text
but will not receive tool calls. Use llama3.2 for a good default local coding
experience.
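The "silently omit the tools field" behavior described above could look roughly like the following request-body builder. This is a sketch, not the provider's real code: `buildChatBody` and the types are hypothetical, and matching on the exact model tag is an assumption (a tag such as `llama3.2:latest` would need extra handling).

```typescript
// Hypothetical sketch: include `tools` only for models known to support them.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}
interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
  tools?: unknown[];
}

// Model names copied from the list of known tool-capable models above.
const TOOL_CAPABLE_MODELS = new Set([
  "llama3.1", "llama3.1:70b", "llama3.2", "llama3.2:3b",
  "qwen2.5", "qwen2.5-coder", "qwen3", "devstral", "llama4",
]);

function buildChatBody(model: string, messages: ChatMessage[], tools: unknown[]): ChatRequestBody {
  const body: ChatRequestBody = { model, messages, stream: true };
  // Non-tool-capable models get a plain text-generation request.
  if (tools.length > 0 && TOOL_CAPABLE_MODELS.has(model)) {
    body.tools = tools;
  }
  return body;
}
```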
ECONNREFUSED detection. If the Ollama daemon is not running, Makhlouf surfaces an actionable error:
error: ollama: connection refused at http://127.0.0.1:11434 — is `ollama serve` running? Set OLLAMA_HOST to point at a different endpoint.
This is a terminal error — the retry wrapper recognizes the message prefix and will not waste quota retrying a downed daemon.
Retry wrapper
All three providers are wrapped uniformly with an exponential-backoff retry layer
(withRetry) applied inside the createProvider factory. Retry triggers on
HTTP 408/429/500/502/503/504 and common network errors (ECONNRESET, ETIMEDOUT,
ENOTFOUND, EAI_AGAIN). It fails fast on HTTP 400/401/403/404, user aborts
(AbortSignal), and Ollama ECONNREFUSED. The classifier inspects the error string
from the provider's event stream — there is no bypass code path.
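The retry/fail-fast split above can be summarized as a classifier. Note the simplification: the text says the real classifier inspects the error string from the event stream, whereas this sketch assumes the status code and network error code have already been extracted into a hypothetical `FailureInfo` object. Treating anything unrecognized as non-retryable is also an assumption.

```typescript
// Hypothetical sketch of the retry classifier rules listed above.
const RETRYABLE_STATUS = new Set([408, 429, 500, 502, 503, 504]);
const RETRYABLE_NETWORK = new Set(["ECONNRESET", "ETIMEDOUT", "ENOTFOUND", "EAI_AGAIN"]);

interface FailureInfo {
  status?: number;   // HTTP status, when the failure came from a response
  code?: string;     // network error code, when the request never completed
  aborted?: boolean; // true when the user's AbortSignal fired
}

function shouldRetry(failure: FailureInfo): boolean {
  // User aborts and a downed Ollama daemon fail fast.
  if (failure.aborted) return false;
  if (failure.code === "ECONNREFUSED") return false;
  // 400/401/403/404 are not in the retryable set, so they fail fast too.
  if (failure.status !== undefined) return RETRYABLE_STATUS.has(failure.status);
  if (failure.code !== undefined) return RETRYABLE_NETWORK.has(failure.code);
  // Assumption: unrecognized failures are not retried.
  return false;
}
```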
Defaults: { base: 1000, maxAttempts: 4, maxDelay: 30000 }. Override in
.makhloufrc:
{
"retry": {
"base": 500,
"maxAttempts": 6,
"maxDelay": 60000
}
}
maxAttempts is bounded to [1, 10] and maxDelay to [0, 600000] (10 min) by
the Zod schema, so misuse is bounded at the config layer. Setting maxAttempts: 1
disables retry entirely (one initial try, zero retries). The backoff formula is
min(maxDelay, base * 2^attempt) + Math.random() * 1000ms — the constant-jitter
floor prevents thundering-herd retry storms when multiple clients rate-limit
simultaneously.
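The backoff formula above, min(maxDelay, base * 2^attempt) plus up to 1000 ms of random jitter, can be written out as follows. This is a sketch: `backoffDelayMs` is a hypothetical name, and counting attempts from 0 is an assumption because the text does not state the starting index. The random source is injectable so the arithmetic can be checked deterministically.

```typescript
// Hypothetical sketch of the documented backoff formula.
interface RetryConfig {
  base: number;        // milliseconds
  maxAttempts: number;
  maxDelay: number;    // milliseconds
}

// Defaults as documented above.
const DEFAULT_RETRY: RetryConfig = { base: 1000, maxAttempts: 4, maxDelay: 30000 };

function backoffDelayMs(
  attempt: number, // assumption: 0 for the first retry
  config: RetryConfig = DEFAULT_RETRY,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(config.maxDelay, config.base * 2 ** attempt);
  // Jitter is added after the cap, so a delay can exceed maxDelay by up to 1000 ms.
  return exponential + random() * 1000;
}
```

With the defaults and jitter forced to zero, attempts 0 through 3 wait 1000, 2000, 4000, and 8000 ms, and any attempt from 5 onward is capped at 30000 ms before jitter.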
Model Aliases
Aliases let you type a short, memorable name (fast, cheap, local) instead of
the full provider + model tuple every time. Aliases are resolved at config-load
time, so --model cheap on the CLI and "model": "cheap" in .makhloufrc both
work identically.
Built-in aliases
| Alias | Provider | Model |
|------------|------------|--------------------------------------|
| fast | anthropic | claude-haiku-4-5 |
| smart | anthropic | claude-opus-4-6 |
| balanced | anthropic | claude-sonnet-4-6 |
| cheap | openrouter | meta-llama/llama-3.3-70b-instruct |
| local | ollama | llama3.2 |
Use them on the command line:
makhlouf chat "Explain double-entry bookkeeping" --model fast
makhlouf chat "Design a microservice boundary" --model smart
makhlouf chat "Translate these error messages" --model cheap
makhlouf chat "Summarize this PR" --model local
Or pin one as the default in .makhloufrc:
{
"model": "balanced"
}
When an alias resolves, Makhlouf rewrites both config.model and
config.provider atomically — so the cross-provider aliases (cheap → OpenRouter,
local → Ollama) pick up the correct env-var requirement automatically. If the
required env var is missing (e.g. cheap needs OPENROUTER_API_KEY), the provider
emits a clean error event with the exact env var name.
User-defined aliases
Extend or override aliases in .makhloufrc:
{
"aliases": {
"production": {
"model": "claude-opus-4-6",
"provider": "anthropic"
},
"experimental": {
"model": "openai/gpt-4o",
"provider": "openrouter"
},
"fast": {
"model": "llama3.2",
"provider": "ollama"
}
}
}
Resolution order:
- User aliases (from config.aliases) win first.
- Built-in aliases fall through when the name is not in user config.
- Unknown names pass through unchanged as literal model IDs.
The last rule means an unrecognized name is treated as a concrete model
identifier — so --model claude-opus-4-6 still works without being defined as an
alias. Overriding a built-in (e.g. redefining fast to point at Ollama) is a
quiet override: no warning, no log line. It is an intentional opt-in.
No recursion. Alias resolution is a single pass. If a user alias's model
field happens to match another alias name, the resolver returns the LITERAL
string — it does not recurse. This prevents stack overflows on circular
references and keeps behavior deterministic.
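The resolution order and the no-recursion rule can be sketched as a single lookup. The built-in table below is copied from the alias table in this section; `resolveModel`, `AliasTarget`, and the `currentProvider` parameter are hypothetical names, and keeping the current provider for unknown names is an assumption.

```typescript
// Hypothetical sketch of single-pass alias resolution.
interface AliasTarget {
  provider: string;
  model: string;
}
type AliasMap = Record<string, AliasTarget>;

const BUILT_IN_ALIASES: AliasMap = {
  fast: { provider: "anthropic", model: "claude-haiku-4-5" },
  smart: { provider: "anthropic", model: "claude-opus-4-6" },
  balanced: { provider: "anthropic", model: "claude-sonnet-4-6" },
  cheap: { provider: "openrouter", model: "meta-llama/llama-3.3-70b-instruct" },
  local: { provider: "ollama", model: "llama3.2" },
};

function lookup(map: AliasMap, name: string): AliasTarget | undefined {
  // Own-property check so names like "constructor" are not treated as aliases.
  return Object.prototype.hasOwnProperty.call(map, name) ? map[name] : undefined;
}

function resolveModel(name: string, currentProvider: string, userAliases: AliasMap = {}): AliasTarget {
  // 1. User aliases win, 2. built-ins fall through, 3. unknown names are literal model IDs.
  const hit = lookup(userAliases, name) ?? lookup(BUILT_IN_ALIASES, name);
  if (hit !== undefined) {
    // Single pass: hit.model is returned as-is even if it matches another alias name.
    return { provider: hit.provider, model: hit.model };
  }
  // Assumption: an unknown name keeps the currently configured provider.
  return { provider: currentProvider, model: name };
}
```

Because the lookup runs once, a user alias whose model is the string `fast` resolves to the literal model ID `fast`, not to claude-haiku-4-5.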
Session Persistence
Save a chat session to disk and resume it later with full conversation context, system prompt, knowledge files, and token budget preserved byte-for-byte.
Save a session
Add --save <name> to any makhlouf chat call. The session is written after the
turn completes successfully (never on error, loop-cap, or abort):
makhlouf chat "Analyze the payment flow in the wire transfer app" --save analysis-1
After the turn finishes, Makhlouf writes
<cwd>/.makhlouf/sessions/analysis-1.json containing:
- The full conversation history (user + assistant + tool-call + tool-result turns)
- The agent name used
- The complete resolved .makhloufrc config (provider, model, tool policy, retry config)
- The BudgetTracker snapshot (pinnedTokens + trailTokens from Phase 14)
- A session UUID, ISO-8601 save timestamp, and schema version
Writes are atomic via temp + rename so a SIGKILL mid-save leaves the prior
snapshot intact. Sessions over 10 MB emit a one-shot stderr warning but are not
blocked — v3.0.0 treats session size as a user decision.
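The temp + rename pattern mentioned above is a common way to get atomic file replacement. Below is a minimal sketch using Node's synchronous fs API; `writeJsonAtomic` and the temp-file naming are hypothetical and not taken from the Makhlouf session store. Atomicity here relies on the temp file living in the same directory (and therefore the same filesystem) as the target.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical sketch: write to a sibling temp file, then rename over the target.
function writeJsonAtomic(targetPath: string, data: unknown): void {
  mkdirSync(dirname(targetPath), { recursive: true });
  const tempPath = `${targetPath}.${process.pid}.tmp`;
  // If the process dies during this write, only the temp file is affected.
  writeFileSync(tempPath, JSON.stringify(data, null, 2), "utf8");
  // The rename swaps the complete file in as a single step.
  renameSync(tempPath, targetPath);
}

// Usage: the second save replaces the first snapshot in one rename.
const sketchDir = mkdtempSync(join(tmpdir(), "session-sketch-"));
const sessionFile = join(sketchDir, ".makhlouf", "sessions", "analysis-1.json");
writeJsonAtomic(sessionFile, { schemaVersion: 1, messages: [] });
writeJsonAtomic(sessionFile, { schemaVersion: 1, messages: ["hello"] });
const reloaded = JSON.parse(readFileSync(sessionFile, "utf8")) as { messages: string[] };
```

A reader that opens the target at any moment sees either the old snapshot or the new one, never a half-written file.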
Resume a session
Add --resume <name> to continue a saved conversation. The new prompt becomes
the next turn in the existing conversation:
makhlouf chat "Now audit the wire transfer path" --resume analysis-1
On resume, Makhlouf:
- Loads the snapshot and validates it against the Zod schema (rejects unknown fields from future-format files).
- Overrides the runtime config with the snapshot's recorded config — so the same model, provider, agent, and tool policy are used. (A v3.1 --allow-model-override flag is planned.)
- Instantiates a new BudgetTracker(model, caps, ceiling), pre-populates it via setPinned(snapshot.pinnedTokens) + setTrail(snapshot.trailTokens), and passes it to the executor. Phase 14's pre-populated short-circuit skips re-counting the knowledge block.
- Prepends snapshot.messages to the new user prompt so the provider sees the full history.
- Runs the turn and streams to the terminal as usual.
Save and resume in one invocation
Compose the flags to continue from one session into a new one:
makhlouf chat "Summarize findings into a report" --resume analysis-1 --save analysis-final
The state machine is strict: load-first / run / save-last. If the load fails the CLI exits with code 2 before rendering a single token. If the turn fails (loop-cap-hit, budget-exceeded, provider error, user abort), the save step is suppressed so partial streams never produce snapshots. If the save itself fails after a successful render, the CLI exits with code 2 and prints an actionable stderr message.
Error cases
- SessionNotFoundError — the named session does not exist. The error carries an availableSessions list so the CLI can show "did you mean X?" without a second directory walk.
- SessionReadError — the file exists but is corrupt JSON or fails Zod shape validation. Prints the file path and the Zod error.
- SessionVersionError — the file was written by a newer CLI version. Carries foundVersion + expectedVersion fields so the migration path is obvious.
Storage location and name rules
Sessions live under <cwd>/.makhlouf/sessions/<name>.json. The cwd is the
directory where makhlouf chat was invoked — each project gets its own session
namespace naturally.
Session names are restricted to ^[a-zA-Z0-9_-]+$ — no path separators, no dots,
no spaces, no shell metacharacters. Path traversal (.., /, \, null bytes,
leading dot) is rejected at the sanitization gate. The same regex is enforced on
both save and load paths as defence-in-depth, so an attacker who edits the
sessions directory directly cannot trigger path traversal via a later --resume.
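The name rule above is strict enough that a single regular-expression test covers every listed traversal case. The sketch below uses the regex from the text; `sessionFilePath` is a hypothetical helper, and joining with forward slashes is a simplification.

```typescript
// Regex copied from the rule above; the helper around it is hypothetical.
const SESSION_NAME_PATTERN = /^[a-zA-Z0-9_-]+$/;

function sessionFilePath(cwd: string, name: string): string {
  // Rejects dots, slashes, backslashes, spaces, null bytes, and empty names,
  // because none of them match the allowed character class.
  if (!SESSION_NAME_PATTERN.test(name)) {
    throw new Error(`invalid session name: ${JSON.stringify(name)}`);
  }
  return `${cwd}/.makhlouf/sessions/${name}.json`;
}
```

Calling the same helper on both the save and the resume path gives the defence-in-depth property described above: a name that fails the check never reaches the filesystem in either direction.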
Add .makhlouf/ to .gitignore
Recommended. Sessions may contain sensitive banking code, customer data, or
API conversation history — they are not encrypted at rest in v3.0.0. Add the
directory to your project's .gitignore:
# .gitignore
.makhlouf/
Session encryption at rest is planned for v3.1.
Listing sessions
The session store exposes a listSessions(cwd) function, and future v3.1 work
will surface it via makhlouf sessions list / delete / rename. For v3.0.0,
use the filesystem directly:
ls .makhlouf/sessions/
cat .makhlouf/sessions/analysis-1.json | jq '.agent, .savedAt, (.messages | length)'
License
MIT License — see LICENSE
Author
Ahmed Makhlouf — Building the future of banking software, one command at a time.
Makhlouf v3.0.0 — 60,000+ lines of banking domain expertise, multi-provider LLM support, session persistence
