cliagent-council

v0.3.3

Convene a panel of CLI-based AI agents to deliberate on engineering problems

Agent Council

Convene a panel of CLI-based AI agents to deliberate on your questions. Three models answer independently and can optionally review each other's work; the invoking agent then synthesizes the verdict as chairman.

Works with Claude Code, Codex CLI, and Gemini CLI. Whichever tool you invoke from becomes the chairman. The others are council members.

Inspired by Karpathy's LLM Council, adapted for the CLI agent ecosystem.

/council "Should we use Postgres or DynamoDB for our event sourcing system?"
Dispatching Stage 1 to 3 agents in parallel...
  - claude (timeout: 120s)
  - codex (timeout: 120s)
  - gemini (timeout: 180s)
  claude responded (38.2s)
  codex responded (52.1s)
  Quorum reached (2/3). Giving stragglers 30s grace...
  gemini responded (64.7s)
  All 3 agents responded.

Stage 1 complete: 3/3 successful opinions

--- CHAIRMAN SYNTHESIS (claude) ---

### Consensus
All agents agree: Postgres is the right choice given strong consistency
requirements and team SQL experience.

### Divergence
Claude emphasizes ACID guarantees as non-negotiable for account balances.
Codex flags a scaling ceiling at ~10TB without sharding.
Gemini suggests read replicas as a scaling bridge.

### Confidence
HIGH — Strong consensus across models.

Why Agent Council?

Every existing LLM council is API-call-based. Karpathy's LLM Council, Perplexity Model Council, Council AI... they all pass text through API endpoints. Agent Council is different:

  1. Grounded deliberation. Council members are CLI agents with tool access. They can grep your codebase, read migration files, run git log. Opinions are grounded in your actual project, not abstract text generation.

  2. Zero marginal cost. You're tapping into subscriptions you already have (Claude Code, Codex, Gemini CLI). No new API tokens to buy.

  3. Living decisions. Every deliberation is a hypothesis that can be re-evaluated. "We chose Postgres 3 months ago... re-run with what we know now." Use /council-revisit to compare then vs now.

Quick Start

Install via npm

npx cliagent-council

This clones the repo and installs skills for every detected CLI agent. After that, you're ready to go.

Or install manually

git clone https://github.com/yogirk/agent-council.git
cd agent-council
./setup

Platform: macOS and Linux. Windows users: use WSL.

Requirements: Bun + at least 2 of these CLI agents:

  • Claude Code (claude) — skills install to ~/.claude/skills/
  • OpenAI Codex (codex) — skills install to ~/.agents/skills/
  • Gemini CLI (gemini) — skills install to ~/.gemini/skills/

Usage

As a skill (Claude Code, Codex CLI, Gemini CLI)

The same slash commands work in all three CLIs. The invoking agent automatically becomes the chairman.

/council "Should we use WebSockets or SSE for real-time updates?"
/council --with-review "Review auth middleware for security issues"
/council --quick "What's the best job queue for Node.js?"

/council-list                              # List all past sessions
/council-replay council-20260329-143000    # Replay a session in terminal
/council-revisit council-20260329-143000   # Re-run with current context (living decisions)
/council-outcome council-20260329-143000 "It worked great"  # Record outcome
/council-nudge council-20260329-143000 --agent codex --correction "Our data will never exceed 100GB"

When invoked from Claude Code, Claude is chairman. From Codex, Codex is chairman. From Gemini, Gemini is chairman. The chairman gives its own independent opinion in Stage 1, then synthesizes all opinions in Stage 3.

From the command line

# Fast mode (default): opinions + synthesis
bin/council --question-file question.txt --project myapp

# Specify chairman explicitly (auto-detected if omitted)
bin/council --question-file question.txt --chairman codex --project myapp

# With peer review
bin/council --question-file question.txt --project myapp --with-review

# Browse past sessions
bin/council list --project myapp
bin/council replay council-20260329-143000 --project myapp

# Nudge: challenge an agent's assumptions after a session
bin/council nudge council-20260329-143000 --agent codex --correction "Budget is not a constraint" --project myapp

# Skip preflight health checks (faster startup)
bin/council --question-file question.txt --project myapp --skip-preflight

How It Works

                    +------------------+
                    |   Your Question  |
                    +--------+---------+
                             |
              Stage 1: Independent Opinions
                             |
            +----------------+----------------+
            |                |                |
      +-----------+    +-----------+    +-----------+
      | Claude    |    | Codex     |    | Gemini    |
      | Code      |    | CLI       |    | CLI       |
      +-----------+    +-----------+    +-----------+
            |                |                |
            v                v                v
      [Opinion A]      [Opinion B]      [Opinion C]
            |                |                |
            +----------------+----------------+
                             |
               Stage 2: Anonymized Peer Review
                        (optional: --with-review)
                             |
                Stage 3: Chairman Synthesis
                             |
                    +------------------+
                    |  Final Verdict   |
                    |  with consensus  |
                    |  and dissent     |
                    +------------------+
                             |
               Stage 4: Targeted Nudge (optional)
                    "Your assumption about X is wrong"
                             |
                    +------------------+
                    | Updated Opinion  |
                    | with what changed|
                    +------------------+

Stage 1: ALL agents (including the chairman) answer independently, in parallel. Each gets your question + codebase context. No visibility into what others are producing. Once a quorum of opinions arrives, a grace window starts for slower agents.
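The quorum-plus-grace behavior described for Stage 1 can be sketched roughly as follows. This is a minimal illustration, not Agent Council's actual dispatcher: `gatherWithQuorum`, `AgentResult`, and the rule "start the grace timer once `quorum` opinions have arrived" are assumptions made for the example, and per-agent timeouts are left out for brevity.

```typescript
// Illustrative sketch only: gatherWithQuorum and its quorum rule are assumptions,
// not Agent Council's actual dispatcher.
type AgentResult = { agent: string; opinion: string };

function gatherWithQuorum(
  tasks: Record<string, Promise<string>>,
  quorum: number,
  graceMs: number,
): Promise<AgentResult[]> {
  return new Promise((resolve) => {
    const agents = Object.keys(tasks);
    const results: AgentResult[] = [];
    let settled = 0;
    let done = false;
    let graceTimer: ReturnType<typeof setTimeout> | undefined;

    // Resolve exactly once and cancel any pending grace timer.
    const finish = () => {
      if (done) return;
      done = true;
      if (graceTimer !== undefined) clearTimeout(graceTimer);
      resolve(results.slice());
    };

    // Every agent, successful or not, counts toward "everyone has settled".
    const onSettled = () => {
      settled += 1;
      if (settled === agents.length) finish();
    };

    if (agents.length === 0) finish();

    for (const agent of agents) {
      tasks[agent].then(
        (opinion) => {
          if (!done) {
            results.push({ agent, opinion });
            // The first time the quorum is met, start the grace window for stragglers.
            if (results.length >= quorum && graceTimer === undefined) {
              graceTimer = setTimeout(finish, graceMs);
            }
          }
          onSettled();
        },
        () => onSettled(), // a failed agent contributes no opinion
      );
    }
  });
}
```

With a quorum of 2 and a 30s grace window, a third agent that answers within the window is included, while one that answers later is dropped from that session's results.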

Stage 2 (optional): Each agent reviews the others' anonymized opinions. Scores them on correctness, completeness, and feasibility. Produces a ranking.
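One way to picture the anonymize-score-rank flow is the sketch below. Everything in it is hypothetical: the `Opinion A`/`Opinion B` labels, the `Review` shape, and averaging the three criteria into a single mean are assumptions for illustration. The README does not specify the score scale or how rankings from different reviewers are combined.

```typescript
// Hypothetical sketch: labels, types, and the averaging rule are assumptions.
type Scores = { correctness: number; completeness: number; feasibility: number };
type Review = { reviewer: string; target: string; scores: Scores }; // target is an anonymized label

// Hide agent identities behind neutral labels: "Opinion A", "Opinion B", ...
function anonymize(agents: string[]): Record<string, string> {
  const labels: Record<string, string> = {};
  agents.forEach((agent, i) => {
    labels[agent] = "Opinion " + String.fromCharCode(65 + i);
  });
  return labels;
}

// Average the three criteria per review, then average across reviewers and sort best-first.
function rankOpinions(reviews: Review[]): { label: string; mean: number }[] {
  const totals: Record<string, { sum: number; count: number }> = {};
  for (const { target, scores } of reviews) {
    const avg = (scores.correctness + scores.completeness + scores.feasibility) / 3;
    const entry = totals[target] ?? { sum: 0, count: 0 };
    totals[target] = { sum: entry.sum + avg, count: entry.count + 1 };
  }
  return Object.keys(totals)
    .map((label) => ({ label, mean: totals[label].sum / totals[label].count }))
    .sort((a, b) => b.mean - a.mean);
}
```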

Stage 3: The chairman (whichever CLI you invoked from) reads all opinions (including its own from Stage 1) and synthesizes: where they agree, where they diverge, and a final recommendation with confidence level. When agents fundamentally disagree, the synthesis flags it explicitly with per-agent confidence so you can decide.

Stage 4 (optional): After reading the verdict, you can nudge a specific agent: "Your assumption about X is wrong." The agent reconsiders with your correction and produces an updated recommendation explaining what changed and what stayed the same.

Configuration

Create ~/.council/config.json to customize models, timeouts, and quorum behavior:

{
  "models": {
    "claude": "claude-opus-4-6",
    "codex": "gpt-5.4",
    "gemini": "gemini-3.1-pro"
  },
  "timeout_ms": {
    "claude": 120000,
    "codex": 120000,
    "gemini": 180000
  },
  "quorum_grace_ms": 30000
}

All fields are optional. Missing fields use the defaults shown above.

  • timeout_ms: Per-agent timeout in milliseconds. Gemini defaults to 180s (it's slower). Can also be a single number applied to all agents.
  • quorum_grace_ms: Once enough agents respond (quorum), stragglers get this grace window before the council proceeds without them. Default: 30s.
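As a rough illustration of how these defaults and the "single number or per-agent object" form of timeout_ms might be resolved, here is a small sketch. The type and function names (`CouncilConfig`, `resolveTimeouts`, `resolveGrace`) are made up for this example and are not taken from the Agent Council source; only the default values come from the config shown above.

```typescript
// Hypothetical sketch: names are illustrative; default values mirror the README.
type Agent = "claude" | "codex" | "gemini";

type CouncilConfig = {
  models?: Partial<Record<Agent, string>>;
  timeout_ms?: number | Partial<Record<Agent, number>>;
  quorum_grace_ms?: number;
};

const DEFAULT_TIMEOUT_MS: Record<Agent, number> = {
  claude: 120000,
  codex: 120000,
  gemini: 180000,
};
const DEFAULT_QUORUM_GRACE_MS = 30000;

function resolveTimeouts(config: CouncilConfig): Record<Agent, number> {
  const t = config.timeout_ms;
  // A single number applies to every agent.
  if (typeof t === "number") return { claude: t, codex: t, gemini: t };
  // Otherwise fill in any agent that was not listed with its default.
  const perAgent = t ?? {};
  return {
    claude: perAgent.claude ?? DEFAULT_TIMEOUT_MS.claude,
    codex: perAgent.codex ?? DEFAULT_TIMEOUT_MS.codex,
    gemini: perAgent.gemini ?? DEFAULT_TIMEOUT_MS.gemini,
  };
}

function resolveGrace(config: CouncilConfig): number {
  return config.quorum_grace_ms ?? DEFAULT_QUORUM_GRACE_MS;
}
```

For example, `resolveTimeouts({ timeout_ms: { gemini: 240000 } })` keeps the 120000 ms defaults for claude and codex and only raises the Gemini timeout.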

Storage

Council sessions are stored in ~/.council/{project}/. Each session contains:

  • meta.json — question, agents, mode, timestamp
  • stage1/opinion_*.json — individual agent opinions
  • stage2/review_*.json — peer reviews (if --with-review)
  • stage4/nudge_*.json — nudge results (if nudge was used)
  • synthesis.json — chairman's final verdict
  • viewer.html — interactive viewer (open in browser)
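If you want to script against this layout, a path helper along these lines may be a useful starting point. It assumes each session sits in a directory named after its session ID (for example council-20260329-143000) under ~/.council/{project}/. That directory naming is an assumption, not something the README confirms, and `sessionPaths` is a hypothetical helper.

```typescript
// Hypothetical helper: assumes <home>/.council/<project>/<sessionId>/ as the session root.
type SessionPaths = {
  meta: string;
  synthesis: string;
  viewer: string;
  stage1Dir: string; // holds opinion_*.json
  stage2Dir: string; // holds review_*.json (only with --with-review)
  stage4Dir: string; // holds nudge_*.json (only if a nudge was used)
};

function sessionPaths(home: string, project: string, sessionId: string): SessionPaths {
  const root = [home, ".council", project, sessionId].join("/");
  return {
    meta: root + "/meta.json",
    synthesis: root + "/synthesis.json",
    viewer: root + "/viewer.html",
    stage1Dir: root + "/stage1",
    stage2Dir: root + "/stage2",
    stage4Dir: root + "/stage4",
  };
}
```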

Viewer

Every council session generates a self-contained HTML viewer. Open it in your browser to explore:

  • Editorial monograph layout with tonal surface layering (no borders)
  • Tabbed agent opinions with full-width reading area showing recommendation, reasoning, assumptions, tradeoffs, belief triggers, and dissent
  • Verdict section with consensus/divergence cards and horizontal metadata strip
  • Peer review matrix with color-coded score pips and hover tooltips
  • Nudge timeline with before/after diffs and "what changed" explanations
  • Light and dark mode with toggle (respects system preference)
  • Agent identity via colored geometric icons
  • Outcome banners when a decision outcome has been recorded
  • Revisit comparison side-by-side when viewing a revisited session
  • Self-contained HTML, zero external dependencies, responsive, XSS-safe

Does It Work?

We ran 3 benchmark questions through the council and compared against a single agent (Claude Opus 4.6). The council consistently found more considerations:

| Benchmark | Single Agent | Council | Delta |
|-----------|--------------|---------|-------|
| Database choice (Postgres vs DynamoDB) | 1/5 (20%) | 3/5 (60%) | +2 |
| Error handling (exceptions vs Result types) | 0/5 (0%) | 1/5 (20%) | +1 |
| Deployment (Kubernetes vs Docker Compose vs PaaS) | 3/5 (60%) | 4/5 (80%) | +1 |
| Average | 27% | 53% | +1.3 |

The council found twice as many expected considerations (8 of 15 versus 4 of 15 across the three benchmarks). This measures consideration coverage (did the response mention scaling? cost? team experience?), not answer quality. Run your own eval: bun run eval/run-eval.ts --dry-run to see all 10 benchmarks. The test suite has 59 tests with 133 assertions covering adapters, parsing, prompts, preflight, nudge, and viewer generation.
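To make "consideration coverage" concrete, here is a minimal sketch of how such a score could be computed. It assumes a simple case-insensitive keyword match against a list of expected considerations; the real eval in eval/run-eval.ts may use a different matching method, and `coverage` and `averagePercent` are hypothetical names.

```typescript
// Hypothetical sketch: keyword matching is an assumption about how coverage is judged.
function coverage(response: string, expected: string[]): number {
  const text = response.toLowerCase();
  const hits = expected.filter((k) => text.indexOf(k.toLowerCase()) !== -1).length;
  return hits / expected.length;
}

// Mean of per-benchmark coverage fractions, as a whole-number percentage.
function averagePercent(fractions: number[]): number {
  const mean = fractions.reduce((a, b) => a + b, 0) / fractions.length;
  return Math.round(mean * 100);
}

// Per-benchmark fractions taken from the results table.
const singleAgent = averagePercent([1 / 5, 0 / 5, 3 / 5]);
const council = averagePercent([3 / 5, 1 / 5, 4 / 5]);
console.log(singleAgent, council);
```

Fed the per-benchmark fractions from the table, this reproduces the 27% and 53% averages.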

Proactive Suggestions

Agent Council can suggest /council when it detects you're making a decision with trade-offs. After setup, an ambient skill watches for patterns like:

  • "should we use X or Y" → suggests /council
  • Referencing past decisions → suggests /council-revisit
  • Old council sessions without outcomes → suggests /council-outcome

Suggestions are quiet (a single line after the response), max 2 per session, and never interrupt your flow. Disable in ~/.council/config.json:

{ "proactive": false }
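The "quiet, max 2 per session" behavior could look something like the sketch below. It covers only the first pattern ("should we use X or Y"), and the regex, the `createSuggester` name, and the counter are illustrative assumptions, not the ambient skill's actual logic.

```typescript
// Hypothetical sketch: the regex and function names are illustrative assumptions.
function createSuggester(enabled = true, maxPerSession = 2) {
  let shown = 0;
  return (message: string): string | null => {
    // Respect the "proactive" switch and the per-session cap.
    if (!enabled || shown >= maxPerSession) return null;
    // Matches phrasing like "should we use X or Y".
    if (/\bshould we use\b.+\bor\b/i.test(message)) {
      shown += 1;
      return "/council";
    }
    return null;
  };
}
```

Passing `false` as the first argument mirrors setting "proactive" to false: the suggester then never returns a suggestion.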

Use Cases

  • Architecture decisions: "Postgres vs DynamoDB for event sourcing at 10k events/sec?"
  • Code review: "Review this auth middleware for security issues"
  • Debugging: "Our API latency spiked 3x after commit abc123. Most likely cause?"
  • Technology selection: "Compare BullMQ, Agenda, and bee-queue for our Node.js job queue"
  • General questions: Works for any question, not just engineering

What This Is Not

  • Not a replacement for single-agent work. Most tasks don't need a council. Use for decisions with meaningful trade-offs.
  • Not a code generation tool. The council deliberates and recommends. It doesn't write code collaboratively.
  • Not cheap in time. Expect 60-120 seconds for fast mode. This is a "stop and think" tool.
  • Not real-time. Parallel dispatch helps, but CLI agents take time.

Roadmap

  • v0.1.0 (done): Three-stage deliberation, 3 adapters, cross-platform skills (Claude Code + Codex + Gemini), living decisions, outcome tracking, security hardening, progressive output, proactive nudge system, evaluation benchmarks
  • v0.2.0 (done): Editorial monograph viewer redesign, consensus KPI fix, contextual nudges in skill flow
  • v0.3.0 (done): Preflight health checks, error classification, retry wrapper, nudge subcommand (challenge agent assumptions), assumptions/belief trigger parsing, 59 tests
  • Next: Shareable deliberation exports, calibration profiles (which model is best at what), input snapshotting for reproducible sessions

License

MIT