makhlouf

v3.1.0

Standalone TypeScript CLI that acts as a senior banking software architect

Makhlouf v3.0.0

The most comprehensive standalone CLI for building enterprise Django banking systems with any LLM provider.

Created by Ahmed Makhlouf — 60,000+ lines of banking domain expertise, 25 commands, 13 AI agents, 50+ knowledge files, multi-provider LLM support (Anthropic, OpenRouter, Ollama), and local session persistence.


What is Makhlouf?

Makhlouf is a Claude Code plugin that turns Claude into a senior banking software architect. It knows how to build PCI DSS-compliant, enterprise-grade Django banking systems from scratch — with proper ACID ledger engines, maker-checker approvals, encrypted PII, multi-tenant IAM, and 35 banking domain apps.

One command to scaffold. One command to build. One command to ship.

Key Features

| Category | What You Get |
|----------|-------------|
| 25 Commands | Full lifecycle: scaffold → build → test → QA → audit → ship |
| 13 AI Agents | Parallel app builders, QA agents, security testers, persona simulators |
| 18 Backend Decisions | Centralized from Day 1: BaseModel, IAM, Ledger, Pipeline, Approval, Forms |
| 16 Frontend Decisions | React 19 + TypeScript + Tailwind 4 + shadcn/ui with HMAC signing |
| 35 Banking Apps | Accounts, transfers, cards, loans, FX, fees, compliance, KYC, and more |
| 10-Wave Build | Dependency-ordered parallel construction with contract verification |
| 7-Layer Tests | Unit → service → integration → contract → security → performance → chaos |
| 352+ Tests | Plugin self-test suite covering all components |
| Cross-Session Memory | Decisions persist across Claude sessions via .forge/intelligence/ |
| Project Style Adaptation | Learns and applies project-specific coding conventions |
| Self-Learning | Risk heatmap, coupling map, predictions — gets smarter each build |
| PCI DSS Level 1 | 12 requirements mapped to code checks |
| Multi-Persona Sim | 50+ AI actors playing merchant/teller/admin/compliance roles |

Commands

Phase 15 ports all 25 makhlouf-* commands from the v2.1.0 plugin into the standalone CLI. After setting ANTHROPIC_API_KEY, run any of the commands below (plus chat as the raw escape hatch). Run makhlouf <command> --help for per-command flags and examples.

Build

  • makhlouf new — Initialize a new Django banking project from scratch
  • makhlouf execute — Build ALL apps wave-by-wave with parallel agents
  • makhlouf app — Build a single Django app from scratch with TDD
  • makhlouf feature — Build a feature spanning multiple Django apps end-to-end
  • makhlouf extend — Add a new feature to ONE existing Django app
  • makhlouf frontend — Build React frontend components for a Django banking app
  • makhlouf wire — Connect Django backend to React frontend with typed clients

Fix and quality

  • makhlouf fix — Fix a bug with multi-agent diagnosis and parallel fix attempts
  • makhlouf test — Run the automated 7-layer test suite (code-level, minutes)
  • makhlouf qa — Exhaustive browser E2E, API smoke, and visual regression QA
  • makhlouf audit — Security audit: permissions, encryption, PII, compliance
  • makhlouf clean — Detect code smells: duplicates, dead code, orphan migrations
  • makhlouf refactor — Execute structural code changes: split apps, move models

Ship and observe

  • makhlouf status — Show Makhlouf build progress dashboard
  • makhlouf scan — Rebuild .forge/contracts/ from live code after manual edits
  • makhlouf migrate — Safe migration management with zero-downtime patterns
  • makhlouf ship — Run full test suite, generate report, and create a pull request
  • makhlouf docs — Auto-generate API docs and architecture diagrams from live code
  • makhlouf run — Run the banking system locally and verify service health

Intelligence and maintenance

  • makhlouf simulate — Multi-persona business simulation across merchant, teller, admin
  • makhlouf study — Deep-read your own codebase to build contracts and conventions
  • makhlouf learn — Study an external codebase and compare it against your project
  • makhlouf integrate — Replace stub providers with real external services (Twilio, etc.)
  • makhlouf test-report — Show last test results with ship-readiness verdict
  • makhlouf update — Update Makhlouf plugin to the latest version from GitHub

Escape hatch

  • makhlouf chat "<prompt>" — Run a single agent with any prompt (bypasses the command catalog, from Phase 13)

Non-interactive mode

Pass --no-interactive to suppress the spinner and ANSI colors for CI-friendly output:

makhlouf --no-interactive status
CI=1 makhlouf status   # auto-detected when CI env is set or stdout is not a TTY

The flag keeps the raw token stream, tool call lines, and usage footer intact so CI logs remain meaningful. It is globally applicable and propagates to every catalog command plus chat.
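The detection rule above can be sketched as a small pure function. This is an illustration only: `isInteractive` and the `OutputEnv` shape are hypothetical names, not Makhlouf's actual code, and treating an empty `CI` value as "not set" is an assumption.

```typescript
// Hypothetical sketch: names and shape are assumptions, not Makhlouf's real code.
interface OutputEnv {
  noInteractiveFlag: boolean; // true when --no-interactive was passed
  ciEnv: string | undefined;  // value of the CI environment variable, if any
  stdoutIsTTY: boolean;       // whether stdout is attached to a terminal
}

// Spinner and ANSI colors are used only when nothing asks for plain output.
function isInteractive(env: OutputEnv): boolean {
  if (env.noInteractiveFlag) return false;                        // explicit flag wins
  if (env.ciEnv !== undefined && env.ciEnv !== "") return false;  // CI env is set
  return env.stdoutIsTTY;                                         // piped output is plain
}
```

In a Node.js process the inputs would come from `process.env.CI` and `process.stdout.isTTY`.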

Quick Start

1. Install

# Clone the plugin
git clone https://github.com/ahmedmk/makhlouf.git ~/.claude/skills/makhlouf

# Create symlinks for Claude Code discovery
cd ~/.claude/skills
for skill in makhlouf/skills/makhlouf-*/; do
  ln -sfn "$skill" "$(basename "$skill")"
done

# Make scripts executable
chmod +x ~/.claude/skills/makhlouf/bin/*.sh

2. Start a New Session

# Open Claude Code in any directory
claude

# Initialize a new banking project
/makhlouf-new

# Build all apps (parallel agents)
/makhlouf-execute

# Run tests
/makhlouf-test

# Ship it
/makhlouf-ship

3. Learn an Existing Codebase

# Point at any Django project
cd /path/to/existing-project
/makhlouf-study

# Now all commands work with context
/makhlouf-extend   # add features
/makhlouf-fix      # fix bugs
/makhlouf-audit    # security check

4. Try the v3.0.0 standalone CLI with any provider

Makhlouf v3.0.0 ships as a standalone npm-installable CLI with built-in support for Anthropic Claude, OpenRouter, and local Ollama. After npm install && npm run build (or npx tsx src/cli.ts for development), the five built-in aliases are ready to use out of the box:

# Anthropic Claude (requires ANTHROPIC_API_KEY)
makhlouf chat "Explain the double-entry ledger service" --model fast      # claude-haiku-4-5
makhlouf chat "Design a payments microservice"          --model smart     # claude-opus-4-6
makhlouf chat "Audit the approval workflow"             --model balanced  # claude-sonnet-4-6

# OpenRouter (requires OPENROUTER_API_KEY; free tier available)
makhlouf chat "Translate error messages to Spanish"     --model cheap     # llama-3.3-70b-instruct

# Ollama (requires `ollama serve` running locally; no key)
makhlouf chat "Summarize this commit"                   --model local     # llama3.2

Save a conversation to resume it later:

makhlouf chat "Remember my project is a banking system" --save context-1
makhlouf chat "What architecture did we choose?"        --resume context-1

See the Providers, Model Aliases, and Session Persistence sections below for the full reference.

Architecture

Makhlouf enforces 18 centralized backend decisions and 16 frontend decisions from Day 1:

Backend (Django)

  1. BaseModel — UUID v7, timestamps, soft delete for ALL models
  2. HasIAMPermission — centralized auth, iam_resource = model_name_plural
  3. Middleware Stack — 15-layer security stack in exact order
  4. Correlation ID — every request, service, task, log entry
  5. Audit Logging — log_action() on every mutation
  6. API Envelope — api_response() for all endpoints
  7. Encryption — KEK/DEK for all PII fields
  8. SystemConfig — no hardcoded values, ever
  9. Celery Queues — 5 queues with correlation ID propagation
  10. Error Handler — centralized with domain error codes
  11. DB Routing — read replicas for reports
  12. Lazy Imports — cross-app imports inside functions only
  13. FormService — centralized dynamic form builder
  14. PipelineService — centralized multi-step flow engine
  15. ApprovalService — maker-checker with dual control
  16. LedgerService — ACID double-entry with computed balances
  17. Database Design — partitioning, indexing, microservice-ready
  18. Error Codes — permanent codes by domain (AUTH_1xxx, ACC_2xxx, TXN_3xxx)

Frontend (React 19 + TypeScript)

  1. PageLayout
  2. CanAccess
  3. API Client (HMAC)
  4. React Query
  5. Correlation ID
  6. Money Component
  7. MaskedField (PII)
  8. ErrorBoundary
  9. DynamicForm
  10. PipelineTracker
  11. ApprovalQueue
  12. DataTable
  13. AuthGuard (JWT in-memory)
  14. Feature Flags
  15. WebSocket
  16. i18n + RTL

Knowledge Base

50+ knowledge files covering:

  • Banking Domain — 35 app definitions with models, services, endpoints
  • Security — KEK/DEK encryption, HMAC signing, PCI DSS compliance, OWASP
  • Performance — N+1 detection, caching rules, connection pooling
  • Operations — Rollback strategy, multi-environment, webhook patterns
  • Testing — 7-layer test guides, TDD enforcement, coverage strategy
  • Architecture — API versioning, deadlock prevention, middleware stack
  • Frontend — Auth patterns, component library, E2E testing
  • Intelligence — Risk detection, learning feedback, deep analysis

Project Structure

~/.claude/skills/makhlouf/
├── SKILL.md              # Main router (25 commands)
├── README.md             # This file
├── INSTALL.md            # Detailed installation guide
├── GUIDE.md              # User guide with workflows
├── VERSION               # 2.1.0
├── skills/               # 25 command skills
│   ├── makhlouf-new/
│   ├── makhlouf-execute/
│   ├── makhlouf-app/
│   ├── makhlouf-test/
│   └── ... (21 more)
├── agents/               # 13 AI agent prompts
│   ├── app-builder.md
│   ├── frontend-builder.md
│   ├── qa-smoke-agent.md
│   ├── security-tester.md
│   └── ... (9 more)
├── knowledge/            # 50+ knowledge files
│   ├── architecture-blueprint.md    # THE blueprint (18 decisions)
│   ├── frontend-architecture.md     # 16 frontend decisions
│   ├── banking-domain.md            # 35 app definitions
│   ├── dependency-graph.md          # 10-wave build order
│   ├── security-patterns.md
│   ├── pci-dss-compliance.md
│   ├── patterns/                    # Domain patterns
│   └── testing/                     # Test layer guides
├── templates/            # Code templates
│   ├── project/          # Project scaffold
│   └── app/              # App scaffold
├── bin/                  # Shell scripts
│   ├── forge-lint.sh     # Convention linter (9 rules)
│   ├── forge-state.sh    # Build state management
│   └── forge-memory.sh   # Memory management
└── tests/
    └── test_plugin.sh    # Plugin self-test (352+ tests)

Requirements

  • Claude Code (CLI, Desktop, or VS Code extension)
  • Claude Opus or Sonnet model
  • No other dependencies — the plugin is pure markdown

Updates

cd ~/.claude/skills/makhlouf
git pull origin main

Phase 13 Quickstart — Provider Abstraction + First LLM Call

Phase 13 delivers the minimal end-to-end chain for the standalone TypeScript CLI (v3.0.0): a single agent can call an LLM via the provider abstraction, use Phase 12 tools, and stream output to the terminal.

Prerequisites

  • Node.js >= 20 (tested against Node 22)
  • An Anthropic API key (get one at https://console.anthropic.com)

Install and build

npm install
npm run build

Run the smoke test

export ANTHROPIC_API_KEY=sk-ant-...
npm run smoke

Expected output when the key is set:

SMOKE: starting — agent=app-builder model=claude-haiku-4-5
SMOKE: prompt="List three files in the current directory. Be brief."
---
⏺ app-builder (claude-haiku-4-5)
... streamed response ...

tokens: N in / M out • reason: stop
---
SMOKE: pass (reason=stop, tokens=Nin/Mout)

If ANTHROPIC_API_KEY is not set, the smoke test prints SKIP: ANTHROPIC_API_KEY not set and exits 0 — so CI and local runs without a key stay green.

Run a chat command

# Default agent (app-builder), default model (from .makhloufrc or schema default)
npx tsx src/cli.ts chat "What tools do you have?"

# Override agent
npx tsx src/cli.ts chat -a fix-agent "Explain the last error"

# Override model
npx tsx src/cli.ts chat -m claude-opus-4-6 "Design a payments service"

# Verbose event logging (event types only — prompt content is never logged)
npx tsx src/cli.ts chat -v "List files"

Run the opt-in integration test

export ANTHROPIC_API_KEY=sk-ant-...
npm run test:integration

This hits the real Anthropic API with a trivial prompt and claude-haiku-4-5 (cheapest valid model). It is skipped by default when ANTHROPIC_API_KEY is not set, so the regular npm test suite never incurs API costs.

Exit codes

| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Missing ANTHROPIC_API_KEY or unknown agent |
| 2 | Provider error, config load error, or renderer crash |
| 3 | Tool loop cap hit (25 iterations) |
| 4 | Budget exceeded — history cannot be compacted below the ceiling |
| 130 | Aborted via SIGINT (Ctrl+C) |

Team Mode

Makhlouf v3.0.0 introduces a team agent orchestrator that runs a sequential 4-role pipeline — architect → builder → security → qa — with deterministic verification gates between handoffs. Each role reuses the single-agent executor with its own fresh context window, independent agent prompt, and configurable model. The command is invoked as makhlouf team "<task>" and streams role-scoped events to the terminal in real time.

What it does:

  1. Architect designs the change (default agent: app-builder). Gate: design-consistency sanity check on the assistant text.
  2. Builder implements the change (default agent: extend-agent). Gate: auto-detected npm run lint, npm run typecheck, and npm test from your package.json — only commands that actually exist are run, so a project with only lint gets a one-command gate.
  3. Security audits for PCI DSS and banking concerns (default agent: security-tester). Gate: report-check (verifies the role wrote a non-empty artifact).
  4. QA plans and generates tests (default agent: test-master). Gate: report-check.

Between each role a verification gate runs. On gate failure, the current role is retried (up to 2 additional attempts for architect and builder; security and qa are read-only and do NOT retry). On retry exhaustion, the pipeline aborts with exit code 5 (team gate exhausted).
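A minimal sketch of the builder gate's script detection and the per-role retry budget, assuming the gate only looks at the `scripts` object of package.json. `detectGateCommands` and `maxAttemptsFor` are hypothetical helper names used for illustration; the real gate code may differ.

```typescript
// Hypothetical sketch; Makhlouf's real gate code may differ.
type Role = "architect" | "builder" | "security" | "qa";

// Gate scripts in the order the README lists them.
const GATE_SCRIPTS = ["lint", "typecheck", "test"] as const;

// Returns the npm commands to run, skipping scripts the project does not define.
function detectGateCommands(scripts: Record<string, string>): string[] {
  return GATE_SCRIPTS
    .filter((name) => Object.prototype.hasOwnProperty.call(scripts, name))
    .map((name) => (name === "test" ? "npm test" : `npm run ${name}`));
}

// Total attempts per role: one initial run plus up to 2 retries for
// architect and builder; the read-only roles never retry.
function maxAttemptsFor(role: Role): number {
  return role === "architect" || role === "builder" ? 3 : 1;
}
```

With `{ lint: "eslint ." }` as the only script, the gate reduces to the single command `npm run lint`.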

Session artifacts are written to .forge/team/<session-id>/ where <session-id> is <YYYYMMDD-HHmmss>-<6char hex>:

  • architect.md, builder.md, security.md, qa.md — per-role output with YAML frontmatter (role, agent, model, status, gate result, files touched) and the full assistant text. Each role's artifact is written BEFORE its gate runs, so read-only report-check gates for security/qa have a real file to inspect.
  • summary.md — aggregated report with per-role status table, files touched, and verdict. ALWAYS written at pipeline end, whether the pipeline succeeded or aborted.
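For scripts that need to locate or sort session directories, the `<YYYYMMDD-HHmmss>-<6char hex>` format could be produced roughly as follows. The function names are hypothetical, and the use of UTC and `Math.random` is an assumption; the README does not say which clock or random source the CLI uses.

```typescript
// Hypothetical sketch of the session-id format; UTC is an assumption.
function pad2(n: number): string {
  return String(n).padStart(2, "0");
}

// Builds "<YYYYMMDD-HHmmss>-<6char hex>" from a timestamp and a hex suffix.
function formatSessionId(now: Date, suffix: string): string {
  if (!/^[0-9a-f]{6}$/.test(suffix)) {
    throw new Error("suffix must be 6 lowercase hex characters");
  }
  const date = `${now.getUTCFullYear()}${pad2(now.getUTCMonth() + 1)}${pad2(now.getUTCDate())}`;
  const time = `${pad2(now.getUTCHours())}${pad2(now.getUTCMinutes())}${pad2(now.getUTCSeconds())}`;
  return `${date}-${time}-${suffix}`;
}

// 0x1000000 is 16^6, so the value always fits in six hex digits.
function randomHex6(): string {
  return Math.floor(Math.random() * 0x1000000).toString(16).padStart(6, "0");
}
```

Because the timestamp comes first, a plain lexicographic sort of `.forge/team/` lists sessions in chronological order.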

Per-role model assignment

You can run each role on a different model via .makhloufrc:

{
  "team": {
    "architect": { "agent": "app-builder", "model": "claude-opus-4-6" },
    "builder":   { "agent": "extend-agent", "model": "claude-sonnet-4-6" },
    "security":  { "agent": "security-tester", "model": "claude-opus-4-6" },
    "qa":        { "agent": "test-master", "model": "claude-haiku-4-5" }
  }
}

Or override per invocation via CLI flags:

# Default models from .makhloufrc (or baseConfig.model as the final fallback)
makhlouf team "Add wire transfer to accounts app"

# Override the builder's model for one run
makhlouf team "Fix balance rounding" --team-model-builder claude-opus-4-6

# Full manual assignment
makhlouf team "Audit payment flows" \
  --team-model-architect claude-opus-4-6 \
  --team-model-builder claude-sonnet-4-6 \
  --team-model-security claude-opus-4-6 \
  --team-model-qa claude-haiku-4-5

The priority chain is:

cliOverrides.<role>.model  >  teamConfig.<role>.model  >  baseConfig.model

Note: The global --model flag is ignored when running makhlouf team. Use --team-model-<role> flags to override models per role.
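The priority chain can be read as a simple fallback lookup. The sketch below is illustrative: `resolveRoleModel` and the `RoleModels` type are assumed names, not the CLI's actual API.

```typescript
// Hypothetical sketch of the per-role model priority chain.
type Role = "architect" | "builder" | "security" | "qa";
type RoleModels = Partial<Record<Role, { model?: string }>>;

// cliOverrides.<role>.model > teamConfig.<role>.model > baseConfig.model
function resolveRoleModel(
  role: Role,
  cliOverrides: RoleModels,
  teamConfig: RoleModels,
  baseModel: string,
): string {
  return cliOverrides[role]?.model ?? teamConfig[role]?.model ?? baseModel;
}
```

For example, with `--team-model-builder claude-opus-4-6` on the command line and `claude-sonnet-4-6` in .makhloufrc, the builder runs on `claude-opus-4-6` while the other roles keep their configured models.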

Exit codes

Team mode uses the standard Makhlouf exit code set plus one new code:

| Code | Meaning |
|------|---------|
| 0 | Success (all 4 roles passed their gates) |
| 1 | User error (missing API key, bad flags, unknown agent) |
| 2 | Provider error, config load failure, or renderer crash |
| 3 | Tool loop cap hit |
| 4 | Context budget exceeded |
| 5 | Team gate exhausted — a verification gate failed after max retries |
| 130 | Aborted via SIGINT (Ctrl+C) |

Session artifact retention

Session artifacts under .forge/team/ are kept after the session ends — never auto-deleted. The retention decision is intentional: you can inspect failed runs, replay artifacts through a reviewer, and keep an audit trail of what each role produced. If your project contains sensitive banking code, add .forge/team/ to .gitignore so artifacts never reach your git history:

# .gitignore
.forge/team/

Users who want periodic cleanup can run makhlouf clean (existing command from v2.1.0) or delete old session directories manually.

Verbose logging and secrets

Like every Makhlouf command, makhlouf team --verbose logs to stderr using event TYPES only — never the userPrompt, assistant text, or gate stderr. The verbose stream prints lines like [verbose] team:role-start(architect) and [verbose] team:role-event(builder:text-delta) so you can trace pipeline progress in CI logs without leaking any content.

Context Budget

Makhlouf tracks token usage per model and enforces a configurable ceiling (default: 70% of the model's context window) to leave room for the model's own reasoning and tool output. When the ceiling is approached, the CLI auto-compacts the conversation trail while preserving the pinned knowledge block byte-for-byte — your 50+ markdown knowledge files survive every compaction cycle untouched.

Ceiling configuration

In .makhloufrc:

{
  "preferences": {
    "contextCeiling": 0.7
  }
}

The contextCeiling is a FRACTION between 0.1 and 1.0 (default 0.7). It is multiplied by the active model's context length. Example: on a 1,000,000-token Sonnet 4.6 session, the absolute ceiling is 700_000 tokens.

Supported models and context lengths

| Model | Context window | Tokenizer |
|-------|---------------:|-----------|
| claude-opus-4-6 | 1,000,000 | Anthropic server |
| claude-sonnet-4-6 | 1,000,000 | Anthropic server |
| claude-haiku-4-5 | 200,000 | Anthropic server |
| claude-opus-4-5 | 200,000 | Anthropic server |
| claude-sonnet-4-5 | 200,000 | Anthropic server |
| claude-opus-4-1 | 200,000 | Anthropic server |
| gpt-4o / gpt-4o-mini | 128,000 | tiktoken o200k_base |
| llama3.1 / llama3.2 | 128,000 | tiktoken o200k_base |
| (unknown model) | 32,000 | tiktoken o200k_base (with stderr warning) |

Unknown models fall back to a conservative 32,000-token ceiling AND emit a one-shot warning to stderr. Add new entries to src/context/model-caps.ts to promote a model to its true context length.
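As a worked example of the ceiling arithmetic, the sketch below multiplies the fraction by the model's context window and falls back to 32,000 tokens for unknown models. The names are hypothetical and the table holds only a few of the models listed above; it is not the contents of src/context/model-caps.ts. Treating 32,000 as the fallback window (to which the fraction is then applied) follows the table and is an assumption.

```typescript
// Hypothetical sketch; not the real src/context/model-caps.ts.
const CONTEXT_WINDOWS = new Map<string, number>([
  ["claude-opus-4-6", 1_000_000],
  ["claude-sonnet-4-6", 1_000_000],
  ["claude-haiku-4-5", 200_000],
  ["gpt-4o", 128_000],
]);
const UNKNOWN_MODEL_WINDOW = 32_000; // conservative fallback from the table

// Absolute ceiling in tokens = contextCeiling fraction x context window.
function absoluteCeiling(model: string, contextCeiling = 0.7): number {
  if (contextCeiling < 0.1 || contextCeiling > 1.0) {
    throw new RangeError("contextCeiling must be between 0.1 and 1.0");
  }
  const contextWindow = CONTEXT_WINDOWS.get(model) ?? UNKNOWN_MODEL_WINDOW;
  // Math.round guards against floating-point noise in the product.
  return Math.round(contextWindow * contextCeiling);
}
```

With the default fraction, `absoluteCeiling("claude-sonnet-4-6")` gives 700,000 tokens and `absoluteCeiling("claude-haiku-4-5")` gives 140,000.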

Compaction behavior

When token usage meets or exceeds the ceiling between agent turns, Makhlouf:

  1. Counts the pinned knowledge block ONCE at session start (via the provider's countTokens for Claude, js-tiktoken o200k_base for others).
  2. Segments the message trail into atomic MessageGroups so tool-call and tool-result messages are never split across a compaction boundary.
  3. Builds a deterministic structural summary (no LLM call — just turn counts, tool histograms, and the last user question truncated to 80 chars) and replaces older groups with the summary.
  4. Preserves the last max(4, floor(ceiling / 4000)) message groups verbatim so the active tool loop is never interrupted.
  5. Emits a [compacted: A → B (-N groups)] line to stdout so you know history was rewritten.
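Step 4's retention rule can be sketched as a split over already-segmented message groups. `groupsToKeep` and `splitForCompaction` are hypothetical names; the sketch only shows which groups stay verbatim and which would be replaced by the structural summary.

```typescript
// Hypothetical sketch of the retention rule: max(4, floor(ceiling / 4000)).
function groupsToKeep(ceilingTokens: number): number {
  return Math.max(4, Math.floor(ceilingTokens / 4000));
}

// Splits the trail into older groups (summarized) and recent groups (kept verbatim).
// Groups are treated as atomic, so a tool call and its result are never separated.
function splitForCompaction<T>(groups: T[], ceilingTokens: number): { older: T[]; kept: T[] } {
  const keep = groupsToKeep(ceilingTokens);
  const cut = Math.max(0, groups.length - keep);
  return { older: groups.slice(0, cut), kept: groups.slice(cut) };
}
```

On a 700,000-token ceiling the rule keeps the last 175 groups; on very small ceilings it never drops below 4.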

LLM-based compaction (having the model summarize its own history) is deliberately deferred to v3.1 — the deterministic structural summary keeps Phase 14 reproducible, free of cost, and testable without mocking the model.

Phase 14 limitation: compaction fires between agent calls

Compaction runs between provider.stream calls, not inside a single streaming response. If a single provider call's output exceeds the budget (because the model generated a very large response or a tool returned a huge payload), Makhlouf cannot compact mid-stream — it yields a budget-exceeded error and exits with code 4. v3.1 may refactor this to an internal tool loop that can compact at any step boundary.

On exit code 4, the hint is: try a model with a larger context window or split your task into smaller prompts.

Verbose budget footer

Run with --verbose to see a dim footer after each assistant turn:

[ctx: 12.3k/700.0k (2%) • model: claude-sonnet-4-6]

Or enable it permanently via .makhloufrc (Phase 15 will wire this):

{ "preferences": { "showBudget": true } }

The [compacted: A → B (-N groups)] compaction line is ALWAYS shown, regardless of the verbose flag, because rewriting history is a visible action users must know about.

Phase 13 scope and known gaps

In scope (delivered):

  • LLMProvider interface + Anthropic implementation via ai@6 + @ai-sdk/anthropic
  • Single-agent executor composing Phase 12 building blocks
  • Streaming terminal renderer with dim tool-call lines, spinner, and usage footer
  • Minimal makhlouf chat <prompt> CLI
  • Unit tests + opt-in integration test

Out of scope (deferred):

  • OpenRouter and Ollama providers → Phase 17
  • Model aliasing (--model fast/smart) → Phase 17
  • Context budget / auto-compaction → Phase 14
  • Full Commander command tree (all 25 makhlouf-* commands) → Phase 15
  • Team orchestration → Phase 16
  • Session save/restore → Phase 17
  • Retry/backoff on transient errors → Phase 17

Known gaps:

  • T-12-07 / T-13-03 path-traversal: Phase 12 deferred path normalization for filePatch / fileWrite tool calls, and Phase 13 inherits this gap. Tracked for Phase 17 hardening. Banking repos should use the allowed tool category filter in .makhloufrc to scope what the agent can touch.

Providers

Makhlouf v3.0.0 supports three LLM providers out of the box, all sharing a common LLMProvider interface so agents, tool loops, budget tracking, session persistence, and retry logic work identically regardless of which upstream API is called.

| Provider | Env var | Dependencies | Tool calls | Notes |
|----------|---------|--------------|------------|-------|
| Anthropic | ANTHROPIC_API_KEY | ai@6 + @ai-sdk/anthropic | yes | Default. Server-side token counting. |
| OpenRouter | OPENROUTER_API_KEY | direct fetch (zero deps) | yes | 300+ models, OpenAI-compatible SSE. |
| Ollama | (none) | direct fetch (zero deps) | model-dep | Local daemon, NDJSON streaming, no key. |

Anthropic Claude (default)

Anthropic is the default provider and is wired through the official @ai-sdk/anthropic adapter. The provider reads ANTHROPIC_API_KEY from the environment automatically at call time — Makhlouf's own code never touches the env var.

export ANTHROPIC_API_KEY=sk-ant-...
makhlouf chat "Design a payments service"

Configure in .makhloufrc:

{
  "provider": "anthropic",
  "model": "claude-sonnet-4-6"
}

Supported models include claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5, claude-opus-4-5, claude-sonnet-4-5, and claude-opus-4-1. Sonnet 4.6 and Opus 4.6 carry a 1,000,000-token context window; see "Context Budget" above.

OpenRouter

OpenRouter exposes 300+ models through a single OpenAI-compatible API. Sign up at https://openrouter.ai to get an API key (free tier available for common models).

export OPENROUTER_API_KEY=sk-or-...
makhlouf chat "Analyze the wire-transfer flow" --model cheap

Configure in .makhloufrc:

{
  "provider": "openrouter",
  "model": "meta-llama/llama-3.3-70b-instruct"
}

Popular models:

  • meta-llama/llama-3.3-70b-instruct — 131,072-token context, tool-capable (free tier available via :free suffix)
  • mistralai/mistral-large-2407 — 128,000-token context, tool-capable
  • openai/gpt-4o / openai/gpt-4o-mini — 128,000-token context, tool-capable

See https://openrouter.ai/models for the full catalog. Any model ID that OpenRouter accepts works here — entries in src/context/model-caps.ts promote common models to their correct context length; unknown models fall back to a conservative 32k ceiling with a one-shot stderr warning.

Ollama (local)

Ollama runs LLMs locally on your machine — no API key, no network, no per-token cost. Install with brew install ollama (macOS) or from https://ollama.com, then start the daemon:

ollama serve          # in a separate terminal
ollama pull llama3.2  # download a tool-capable model
makhlouf chat "Say hello in one word." --model local

Configure in .makhloufrc:

{
  "provider": "ollama",
  "model": "llama3.2"
}

Override the daemon endpoint with OLLAMA_HOST:

# Remote Ollama daemon on a workstation
export OLLAMA_HOST=http://192.168.1.50:11434
makhlouf chat "test" --model local

The default endpoint is http://127.0.0.1:11434 (IPv4 literal, not localhost, to avoid ::1 resolution failures on IPv4-only Ollama deployments).

Tool calling support. Ollama's tool-calling support is model-dependent. Known tool-capable models: llama3.1, llama3.1:70b, llama3.2, llama3.2:3b, qwen2.5, qwen2.5-coder, qwen3, devstral, llama4. If you run Makhlouf against a non-tool-capable model (e.g. mistral, phi3), the provider silently omits the tools field from the request body — the agent can still generate text but will not receive tool calls. Use llama3.2 for a good default local coding experience.

ECONNREFUSED detection. If the Ollama daemon is not running, Makhlouf surfaces an actionable error:

error: ollama: connection refused at http://127.0.0.1:11434 — is `ollama serve` running? Set OLLAMA_HOST to point at a different endpoint.

This is a terminal error — the retry wrapper recognizes the message prefix and will not waste quota retrying a downed daemon.

Retry wrapper

All three providers are wrapped uniformly with an exponential-backoff retry layer (withRetry) applied inside the createProvider factory. Retry triggers on HTTP 408/429/500/502/503/504 and common network errors (ECONNRESET, ETIMEDOUT, ENOTFOUND, EAI_AGAIN). It fails fast on HTTP 400/401/403/404, user aborts (AbortSignal), and Ollama ECONNREFUSED. The classifier inspects the error string from the provider's event stream — there is no bypass code path.
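The retry decision can be summarized as a classifier over the failure. The README says the real classifier inspects an error string; the sketch below instead assumes a hypothetical, already-parsed `ProviderFailure` shape so the rules are easy to read. Treating unlisted status codes as non-retryable is also an assumption.

```typescript
// Hypothetical sketch; the real classifier works on error strings.
const RETRYABLE_STATUS = new Set([408, 429, 500, 502, 503, 504]);
const RETRYABLE_NETWORK = new Set(["ECONNRESET", "ETIMEDOUT", "ENOTFOUND", "EAI_AGAIN"]);

interface ProviderFailure {
  status?: number;   // HTTP status, when the server answered
  code?: string;     // network error code such as ECONNRESET
  aborted?: boolean; // true when the user aborted via AbortSignal
}

function isRetryable(failure: ProviderFailure): boolean {
  if (failure.aborted) return false; // user aborts fail fast
  if (failure.status !== undefined) return RETRYABLE_STATUS.has(failure.status);
  if (failure.code !== undefined) return RETRYABLE_NETWORK.has(failure.code); // ECONNREFUSED is not listed
  return false;
}
```

Under these rules a 429 or an `ETIMEDOUT` is retried, while a 401 or an Ollama `ECONNREFUSED` ends the call immediately.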

Defaults: { base: 1000, maxAttempts: 4, maxDelay: 30000 }. Override in .makhloufrc:

{
  "retry": {
    "base": 500,
    "maxAttempts": 6,
    "maxDelay": 60000
  }
}

maxAttempts is bounded to [1, 10] and maxDelay to [0, 600000] (10 min) by the Zod schema, so misuse is bounded at the config layer. Setting maxAttempts: 1 disables retry entirely (one initial try, zero retries). The backoff formula is min(maxDelay, base * 2^attempt) + Math.random() * 1000ms — the constant-jitter floor prevents thundering-herd retry storms when multiple clients rate-limit simultaneously.
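A worked sketch of the backoff formula, using the documented defaults. `backoffDelay` is a hypothetical name, the zero-based `attempt` index is an assumption, and the random source is injectable only to make the example testable.

```typescript
// Hypothetical sketch of: min(maxDelay, base * 2^attempt) + Math.random() * 1000
interface RetryConfig {
  base: number;        // milliseconds
  maxAttempts: number;
  maxDelay: number;    // milliseconds
}

const DEFAULT_RETRY: RetryConfig = { base: 1000, maxAttempts: 4, maxDelay: 30000 };

// attempt is assumed to be zero-based: 0 for the first retry delay.
function backoffDelay(
  attempt: number,
  config: RetryConfig = DEFAULT_RETRY,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(config.maxDelay, config.base * 2 ** attempt);
  return exponential + random() * 1000; // up to 1000 ms of jitter on every delay
}
```

With the defaults, the exponential part grows 1000, 2000, 4000, 8000 ms and is capped at 30000 ms; the jitter is added after the cap, so a delay can exceed `maxDelay` by up to one second.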

Model Aliases

Aliases let you type a short, memorable name (fast, cheap, local) instead of the full provider + model tuple every time. Aliases are resolved at config-load time, so --model cheap on the CLI and "model": "cheap" in .makhloufrc both work identically.

Built-in aliases

| Alias | Provider | Model |
|-------|----------|-------|
| fast | anthropic | claude-haiku-4-5 |
| smart | anthropic | claude-opus-4-6 |
| balanced | anthropic | claude-sonnet-4-6 |
| cheap | openrouter | meta-llama/llama-3.3-70b-instruct |
| local | ollama | llama3.2 |

Use them on the command line:

makhlouf chat "Explain double-entry bookkeeping" --model fast
makhlouf chat "Design a microservice boundary" --model smart
makhlouf chat "Translate these error messages" --model cheap
makhlouf chat "Summarize this PR" --model local

Or pin one as the default in .makhloufrc:

{
  "model": "balanced"
}

When an alias resolves, Makhlouf rewrites both config.model and config.provider atomically — so the cross-provider aliases (cheap → OpenRouter, local → Ollama) pick up the correct env-var requirement automatically. If the required env var is missing (e.g. cheap needs OPENROUTER_API_KEY), the provider emits a clean error event with the exact env var name.

User-defined aliases

Extend or override aliases in .makhloufrc:

{
  "aliases": {
    "production": {
      "model": "claude-opus-4-6",
      "provider": "anthropic"
    },
    "experimental": {
      "model": "openai/gpt-4o",
      "provider": "openrouter"
    },
    "fast": {
      "model": "llama3.2",
      "provider": "ollama"
    }
  }
}

Resolution order:

  1. User aliases (from config.aliases) win first.
  2. Built-in aliases fall through when the name is not in user config.
  3. Unknown names pass through unchanged as literal model IDs.

The last rule means an unrecognized name is treated as a concrete model identifier — so --model claude-opus-4-6 still works without being defined as an alias. Overriding a built-in (e.g. redefining fast to point at Ollama) is a quiet override: no warning, no log line. It is an intentional opt-in.

No recursion. Alias resolution is a single pass. If a user alias's model field happens to match another alias name, the resolver returns the LITERAL string — it does not recurse. This prevents stack overflows on circular references and keeps behavior deterministic.
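The resolution order and the single-pass rule can be sketched as follows. `resolveAlias` and the type names are hypothetical; the built-in table mirrors the one listed above.

```typescript
// Hypothetical sketch of single-pass alias resolution.
interface AliasTarget { provider: string; model: string; }
interface ResolvedModel { model: string; provider?: string; }

const BUILT_IN_ALIASES: Record<string, AliasTarget> = {
  fast: { provider: "anthropic", model: "claude-haiku-4-5" },
  smart: { provider: "anthropic", model: "claude-opus-4-6" },
  balanced: { provider: "anthropic", model: "claude-sonnet-4-6" },
  cheap: { provider: "openrouter", model: "meta-llama/llama-3.3-70b-instruct" },
  local: { provider: "ollama", model: "llama3.2" },
};

// Own-property lookup so names like "constructor" are not treated as aliases.
function lookup(table: Record<string, AliasTarget>, name: string): AliasTarget | undefined {
  return Object.prototype.hasOwnProperty.call(table, name) ? table[name] : undefined;
}

function resolveAlias(name: string, userAliases: Record<string, AliasTarget> = {}): ResolvedModel {
  // 1. user aliases, 2. built-ins, 3. pass the name through as a literal model ID.
  const hit = lookup(userAliases, name) ?? lookup(BUILT_IN_ALIASES, name);
  // Single pass: hit.model is returned as-is even if it matches another alias name.
  return hit ? { provider: hit.provider, model: hit.model } : { model: name };
}
```

A user alias whose `model` is `"fast"` therefore resolves to the literal model string `"fast"`, not to `claude-haiku-4-5`.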

Session Persistence

Save a chat session to disk and resume it later with full conversation context, system prompt, knowledge files, and token budget preserved byte-for-byte.

Save a session

Add --save <name> to any makhlouf chat call. The session is written after the turn completes successfully (never on error, loop-cap, or abort):

makhlouf chat "Analyze the payment flow in the wire transfer app" --save analysis-1

After the turn finishes, Makhlouf writes <cwd>/.makhlouf/sessions/analysis-1.json containing:

  • The full conversation history (user + assistant + tool-call + tool-result turns)
  • The agent name used
  • The complete resolved .makhloufrc config (provider, model, tool policy, retry config)
  • The BudgetTracker snapshot (pinnedTokens + trailTokens from Phase 14)
  • A session UUID, ISO-8601 save timestamp, and schema version

Writes are atomic via temp + rename so a SIGKILL mid-save leaves the prior snapshot intact. Sessions over 10 MB emit a one-shot stderr warning but are not blocked — v3.0.0 treats session size as a user decision.
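The temp + rename pattern looks roughly like this in Node.js. It is a sketch under stated assumptions, not Makhlouf's session store: `writeSnapshotAtomic` and `readSnapshot` are hypothetical names, and the temp-file naming is invented for illustration.

```typescript
import { mkdirSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";

// Hypothetical sketch of an atomic snapshot write.
function writeSnapshotAtomic(targetPath: string, contents: string): void {
  mkdirSync(dirname(targetPath), { recursive: true });
  // The temp file sits next to the target so the rename stays on one filesystem.
  const tempPath = `${targetPath}.${process.pid}.tmp`;
  writeFileSync(tempPath, contents, "utf8");
  // On POSIX systems rename replaces the target in a single step, so a reader
  // sees either the old snapshot or the new one, never a partial file.
  renameSync(tempPath, targetPath);
}

function readSnapshot(targetPath: string): string {
  return readFileSync(targetPath, "utf8");
}
```

If the process is killed before `renameSync` runs, the previous snapshot is untouched and only a stray `.tmp` file is left behind.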

Resume a session

Add --resume <name> to continue a saved conversation. The new prompt becomes the next turn in the existing conversation:

makhlouf chat "Now audit the wire transfer path" --resume analysis-1

On resume, Makhlouf:

  1. Loads the snapshot and validates it against the Zod schema (rejects unknown fields from future-format files).
  2. Overrides the runtime config with the snapshot's recorded config — so the same model, provider, agent, and tool policy are used. (A v3.1 --allow-model-override flag is planned.)
  3. Instantiates a new BudgetTracker(model, caps, ceiling), pre-populates it via setPinned(snapshot.pinnedTokens) + setTrail(snapshot.trailTokens), and passes it to the executor. Phase 14's pre-populated short-circuit skips re-counting the knowledge block.
  4. Prepends snapshot.messages to the new user prompt so the provider sees the full history.
  5. Runs the turn and streams to the terminal as usual.

Save and resume in one invocation

Compose the flags to continue from one session into a new one:

makhlouf chat "Summarize findings into a report" --resume analysis-1 --save analysis-final

The state machine is strict: load-first / run / save-last. If the load fails the CLI exits with code 2 before rendering a single token. If the turn fails (loop-cap-hit, budget-exceeded, provider error, user abort), the save step is suppressed so partial streams never produce snapshots. If the save itself fails after a successful render, the CLI exits with code 2 and prints an actionable stderr message.

Error cases

  • SessionNotFoundError — the named session does not exist. The error carries an availableSessions list so the CLI can show "did you mean X?" without a second directory walk.
  • SessionReadError — the file exists but is corrupt JSON or fails Zod shape validation. Prints the file path and the Zod error.
  • SessionVersionError — the file was written by a newer CLI version. Carries foundVersion + expectedVersion fields so the migration path is obvious.

Storage location and name rules

Sessions live under <cwd>/.makhlouf/sessions/<name>.json. The cwd is the directory where makhlouf chat was invoked — each project gets its own session namespace naturally.

Session names are restricted to ^[a-zA-Z0-9_-]+$ — no path separators, no dots, no spaces, no shell metacharacters. Path traversal (.., /, \, null bytes, leading dot) is rejected at the sanitization gate. The same regex is enforced on both save and load paths as defence-in-depth, so an attacker who edits the sessions directory directly cannot trigger path traversal via a later --resume.
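A minimal sketch of the sanitization gate, applying the same regex before a name is ever turned into a path. `sessionPath` is a hypothetical helper, and the path is joined with forward slashes for simplicity.

```typescript
// Hypothetical sketch of the session-name gate.
const SESSION_NAME_PATTERN = /^[a-zA-Z0-9_-]+$/;

// Validates the name first, then builds <cwd>/.makhlouf/sessions/<name>.json.
function sessionPath(cwd: string, name: string): string {
  if (!SESSION_NAME_PATTERN.test(name)) {
    throw new Error(`invalid session name: ${JSON.stringify(name)}`);
  }
  return `${cwd}/.makhlouf/sessions/${name}.json`;
}
```

Names such as `analysis-1` pass, while `../secrets`, `a/b`, `.hidden`, and the empty string are rejected before any filesystem call is made.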

Add .makhlouf/ to .gitignore

Recommended. Sessions may contain sensitive banking code, customer data, or API conversation history — they are not encrypted at rest in v3.0.0. Add the directory to your project's .gitignore:

# .gitignore
.makhlouf/

Session encryption at rest is planned for v3.1.

Listing sessions

The session store exposes a listSessions(cwd) function, and future v3.1 work will surface it via makhlouf sessions list / delete / rename. For v3.0.0, use the filesystem directly:

ls .makhlouf/sessions/
jq '.agent, .savedAt, (.messages | length)' .makhlouf/sessions/analysis-1.json

License

MIT License — see LICENSE

Author

Ahmed Makhlouf — Building the future of banking software, one command at a time.


Makhlouf v3.0.0 — 60,000+ lines of banking domain expertise, multi-provider LLM support, session persistence