@dvg-os/agency-os

v0.1.5

Published

5 days ago

AI marketing operating system distributed as a Claude Code plugin. BYOK multi-tenant architecture with 40 named agents, 8 departments, strict 0→5 lead pipeline, Cerberus + Hemingway quality gates, Minerva-backed RAG.

Downloads

694

dalistovisionnpm

claude-code claude-code-plugin mcp ai-agents marketing agency outreach hubspot brevo meta rag anti-hallucination

agency-os

An AI marketing operating system that runs inside your Claude Code, against your own keys, with your own customer data.

40 named agents. Eight departments. One strict 0→5 lead pipeline. Kill switches by default. No backend, no shared database, no telemetry of customer content.

Status: active engineering. 430 tests, all green on main. Plugin installer ships. Internal validation pass next, then the first paying client.

Why this exists

Marketing automation today is closed SaaS. GoHighLevel charges $97–497/month for a black box. Lemlist, Apollo, Reply, Smartlead — all closed, all bolt-on AI, none extensible.

For an agency that already uses Claude Code and wants to own its workflow, there is no agent-native, MCP-powered, BYOK marketing OS. agency-os fills that gap.

Open core. Customer keys. Customer data. Customer rules. The methodology is the product; the SaaS is optional.

What it does

Enriches a lead — Sherlock 10-point company analysis, brand match, decision-maker discovery, location enrichment.
Writes outreach — four drafts in your brand voice, signature templates injected, every sentence judged before send.
Sends outreach — Brevo delivery, double-gated by EMAIL_SENDING_ENABLED=true AND BREVO_MODE=live. Test recipient allowlist for dry runs.
Monitors replies — Brevo events sync back to HubSpot status fields (delivered, opened, clicked, bounced, replied).
Launches Meta ads — Hawkeye agent. Campaigns default to PAUSED on creation; you flip them ENABLED yourself.
Reads Google Ads — Oracle agent. Reads campaigns + 30-day metrics; writes are kill-switched.
Generates creative — kie.ai (Da Vinci images, Spielberg video, Mozart audio). Costs real credits when KIE_MODE=live.
Audits configuration — /audit-config scans for hardcoded secrets, missing required fields, language rule violations, PII patterns.

Every paid action is off by default. You explicitly flip a kill switch to spend money or send mail.

Install

One command from any terminal:

npx @dvg-os/agency-os install

That downloads the latest signed release from the npm registry (SLSA-provenance attested), copies the plugin into ~/.claude/marketplaces/dalistovision-agency-os/, and wires it into Claude Code's settings.json. Prereqs: Node.js 22+ and Claude Code.

Then open (or restart) Claude Code and run:

/onboard

The wizard walks you through each integration: what it is, why we need it, where to get the key, what permissions it grants, and what happens if you skip it. Live read-only validation runs after each key. The wizard is resumable, so you can close it and come back later.

To scaffold your per-customer config into the current project:

/init-agency-os "Your Agency Name"

That copies a 15-file template: CLAUDE.md, five config files, three rule files, three pre-commit hooks, plus a PostToolUse hook that warns on language + currency rule violations.

Upgrading later is one command — npx @dvg-os/agency-os@latest install. The MCP server checks npm once per day and prints an upgrade banner on Claude Code startup when a new version ships. Silence it with AGENCY_OS_DISABLE_UPDATE_CHECK=1.
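The once-per-day update check can be pictured as a small pure function. This is a minimal sketch, not the plugin's code: `shouldCheckForUpdate` and the idea of passing in the last-check timestamp are assumptions, and only the `AGENCY_OS_DISABLE_UPDATE_CHECK` variable comes from this README.

```typescript
// Hypothetical sketch: function name and parameters are illustrative.
type Env = Record<string, string | undefined>;

const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Returns true when an npm version lookup is due.
function shouldCheckForUpdate(env: Env, lastCheckedAtMs: number | null, nowMs: number): boolean {
  // The documented opt-out wins over everything else.
  if (env.AGENCY_OS_DISABLE_UPDATE_CHECK === "1") return false;
  // Never checked before, or the last check is at least a day old.
  return lastCheckedAtMs === null || nowMs - lastCheckedAtMs >= ONE_DAY_MS;
}
```

A caller would pass `process.env`, the stored timestamp of the previous check, and `Date.now()`.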

Starting from a fresh laptop? The full novice walkthrough (installing Node, Claude Code, and the plugin step by step) lives in docs/install.md.

The customer journey

The lead status state machine — strictly 0→5, no skipping. Each transition has one driving operator skill and (where money or email is involved) explicit kill-switch gates the operator must flip.

                                                       ┌─ KILL SWITCHES ─┐
   ┌────────┐  /enrich-lead   ┌──────────┐             │  EMAIL_SENDING  │
   │  new   │ ──────────────▶ │ enriched │             │   _ENABLED+     │
   │ (0)    │                 │   (1)    │             │  BREVO_MODE     │
   └────────┘                 └─────┬────┘             │   =live         │
                                    │ /match-brand     │  +TEST_RECIPIENT│
                                    ▼                  │  +daily budget  │
   ┌────────┐  /write-outreach ┌──────────┐  /send-   ┌────────────────┐
   │ matched│ ───────────────▶ │ drafted  │ outreach▶ │ outreach_sent  │
   │  (2)   │                  │   (3)    │           │      (4)       │
   └────────┘                  └──────────┘           └────────┬───────┘
       ▲                            │                          │
       │                  ╔═════════╧══════════╗               │
       │                  ║ Cerberus + Hemingway║              │
       │                  ║ combined ≥ 0.85?    ║              │ Brevo events
       │                  ╚═════════╤══════════╝               │ (delivered, opened,
       │                            │                          │  bounced, replied)
       │              ┌─ FAIL ──────┴──── PASS ──┐             │
       │              ▼                           ▼             ▼
       │   ┌──────────────┐              (continue           ┌──────────┐
       │   │ human-review │              to send)            │  reply   │
       │   │   queue      │                                  │   (5)    │
       │   └──────────────┘                                  └────┬─────┘
       │                                                          │
       └──────────── /monitor-replies (Zola classifies) ──────────┘
                          ↓                ↓               ↓
                     interest         objection       unsubscribe
                  (human follows)  (90-day pause)   (global suppression)

Linear flow as commands:

Lead lands in HubSpot
   │
   ├─▶ /enrich-lead <hubspot-id>
   │      Sherlock pulls website, news, LinkedIn, Google Maps
   │      Jordan matches the lead to one of YOUR brands
   │      Minerva stores the analysis with citations
   │
   ├─▶ /write-outreach <hubspot-id>
   │      4 drafts in your brand voice (config/brands.json)
   │      Cerberus checks every factual claim against Minerva
   │      Hemingway scores tone + length + locale
   │      Combined < 0.85 → human review queue
   │
   ├─▶ /send-outreach <hubspot-id>
   │      Refuses unless EMAIL_SENDING_ENABLED=true AND BREVO_MODE=live
   │      Test recipient allowlist enforced
   │      Brevo send + HubSpot note logged
   │
   └─▶ /monitor-replies
          Brevo events → HubSpot status updates
          Replies routed to triage
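The "strictly 0→5, no skipping" rule can be sketched as a transition check. This is an illustrative sketch only: the status names come from the diagram above, while `canTransition` and `transition` are hypothetical names, and the real orchestrator may model more than forward moves (for example the reply-handling loop).

```typescript
// Status names and their order are taken from the state machine diagram;
// the functions themselves are hypothetical, not the plugin's API.
const LEAD_STATUSES = ["new", "enriched", "matched", "drafted", "outreach_sent", "reply"] as const;
type LeadStatus = (typeof LEAD_STATUSES)[number];

// A move is legal only when it advances the lead by exactly one step.
function canTransition(from: LeadStatus, to: LeadStatus): boolean {
  return LEAD_STATUSES.indexOf(to) === LEAD_STATUSES.indexOf(from) + 1;
}

// Throws instead of silently skipping or rewinding a status.
function transition(from: LeadStatus, to: LeadStatus): LeadStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal lead transition: ${from} -> ${to}`);
  }
  return to;
}
```

Under this sketch, `transition("new", "enriched")` succeeds, while `transition("new", "matched")` throws because it skips status 1.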

Optional parallel tracks: Meta ads (/launch-meta-campaign), creative generation (/generate-video), niche research (YouTube), location data (Google Maps), Google Ads insights (Oracle agent).

Architecture (seven layers)

┌──────────────────────────────────────────────────────────────────┐
│ 1  Agent Runtime           │  Claude Code (host, customer machine)│
├──────────────────────────────────────────────────────────────────┤
│ 2  Operator Skills         │  /onboard  /enrich-lead              │
│    (what the customer       │  /write-outreach  /send-outreach     │
│     types in Claude Code)   │  /monitor-replies  /launch-meta-...  │
│                             │  /generate-video  /audit-config      │
│                             │  /init-agency-os                     │
├──────────────────────────────────────────────────────────────────┤
│ 3  Agent Skills            │  Hermes · Sherlock · Jordan · Minerva│
│    (40 named agents,        │  Cerberus · Hemingway · Harvey       │
│     8 departments)          │  Hawkeye · Oracle · Da Vinci         │
│                             │  Spielberg · Mozart · Zola ...       │
├──────────────────────────────────────────────────────────────────┤
│ 4  Pipeline Orchestrator   │  Strict 0→5 state machine            │
│                             │  Cerberus + Hemingway quality gates  │
│                             │  Per-run + per-day budget enforcement│
│                             │  Decision replay (FileRunRecorder)   │
├──────────────────────────────────────────────────────────────────┤
│ 5  Minerva KG + MCP Tools  │  SQLite WAL knowledge graph          │
│                             │  10-level customer context schema    │
│                             │  Bundled MCP server (stdio)          │
├──────────────────────────────────────────────────────────────────┤
│ 6  Provider Adapters       │  HubSpot · Brevo · Meta · kie.ai     │
│    + StateMachine port      │  YouTube · Google Maps · Scraper     │
│                             │  Google Ads (Oracle)                 │
│                             │  Circuit breaker · rate limit · bulkhead │
├──────────────────────────────────────────────────────────────────┤
│ 7  Configuration            │  Per-customer JSON in config/        │
│                             │  brands · customers · signatures     │
│                             │  language · thresholds               │
└──────────────────────────────────────────────────────────────────┘

Cross-cutting: structured logs (Pino, 10 credential families auto-redacted),
distributed tracing (OpenTelemetry, lazy-loaded), metrics (Prometheus + JSON
via agency_os_metrics), error tracking (Sentry, lazy-loaded), kill switches
at every paid-action boundary, pre-commit + CI secret scanning.

| # | Layer | What lives here |
|---|---|---|
| 1 | Agent Runtime | Claude Code (host) |
| 2 | Operator Skills | /onboard, /enrich-lead, /write-outreach, /send-outreach, /monitor-replies, /launch-meta-campaign, /generate-video, /audit-config, /init-agency-os |
| 3 | Agent Skills | 40 named agents in 8 departments: Hermes, Cerberus, Hemingway, Sherlock, Jordan, Minerva, Harvey, Hawkeye, Oracle, Da Vinci, Spielberg, Mozart, etc. |
| 4 | Pipeline Orchestrator | Strict 0→5 state machine, parallel video + outreach pipelines, quality gates, per-run + per-day budget enforcement |
| 5 | Minerva KG + MCP Tools | SQLite WAL knowledge graph, 70+ MCP tools exposed via the bundled MCP server |
| 6 | Provider Adapters + StateMachine port | 8 adapters: HubSpot, Brevo, Meta, kie.ai, YouTube, Google Maps, Scraper, Google Ads. HubSpot is the default StateMachine implementation; Pipedrive/Salesforce are swappable. |
| 7 | Configuration | Per-customer JSON in config/, no shared state |

Cross-cutting: structured logs (Pino), distributed tracing (OpenTelemetry, lazy-loaded), metrics counters (Prometheus + JSON), error tracking (Sentry, lazy-loaded), kill switches enforced at the adapter layer, in-process resilience (circuit breaker + rate limiter + bulkhead per adapter).
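To make the per-adapter circuit breaker concrete, here is a minimal sketch of the general pattern. It is not the plugin's implementation: the class name, the consecutive-failure threshold, and the cooldown parameter are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch of a circuit breaker; thresholds and names are illustrative.
class CircuitBreaker {
  private readonly maxFailures: number;
  private readonly cooldownMs: number;
  private failures = 0;
  private openedAt: number | null = null;

  constructor(maxFailures: number, cooldownMs: number) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  // Closed circuit: always allow. Open circuit: allow one trial call once the cooldown has passed.
  allowRequest(nowMs: number): boolean {
    if (this.openedAt === null) return true;
    return nowMs - this.openedAt >= this.cooldownMs;
  }

  // A success closes the circuit and clears the failure streak.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Enough consecutive failures open (or re-open) the circuit.
  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAt = nowMs;
  }
}
```

An adapter would call `allowRequest(Date.now())` before each provider call and report the outcome afterwards. Keeping the clock as a parameter makes the behavior easy to test.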

Safety + observability

Kill switches. Every paid action defaults to off:

| Action | Required env to enable |
|---|---|
| send_email | EMAIL_SENDING_ENABLED=true AND BREVO_MODE=live |
| hubspot_write | HUBSPOT_WRITE_MODE=live |
| meta_write | META_WRITE_MODE=live |
| kie_generate | KIE_MODE=live |
| google_ads_write | GOOGLE_ADS_WRITE_MODE=live |
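The kill-switch table maps naturally onto a lookup plus an "all conditions must hold" check. The sketch below uses the action names and environment variables from the table; the `isActionEnabled` function and its types are hypothetical and only illustrate the default-off behavior.

```typescript
// Action names and env variables come from the kill-switch table;
// the function and type names are hypothetical.
type PaidAction = "send_email" | "hubspot_write" | "meta_write" | "kie_generate" | "google_ads_write";
type Env = Record<string, string | undefined>;

const REQUIRED_ENV: Record<PaidAction, Record<string, string>> = {
  send_email: { EMAIL_SENDING_ENABLED: "true", BREVO_MODE: "live" },
  hubspot_write: { HUBSPOT_WRITE_MODE: "live" },
  meta_write: { META_WRITE_MODE: "live" },
  kie_generate: { KIE_MODE: "live" },
  google_ads_write: { GOOGLE_ADS_WRITE_MODE: "live" },
};

// An action is enabled only when every required variable has exactly the expected value.
// Anything missing or misspelled leaves the action off.
function isActionEnabled(action: PaidAction, env: Env): boolean {
  return Object.entries(REQUIRED_ENV[action]).every(([key, value]) => env[key] === value);
}
```

With an empty environment every action reports disabled, and `send_email` stays disabled until both of its switches are set.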

Cost budgets. Per-run caps in the orchestrator + persistent per-day caps that survive process restarts and reset at UTC midnight. Set DAILY_TOKEN_INPUT_MAX, DAILY_TOKEN_OUTPUT_MAX, DAILY_API_CALLS_MAX. When breached, active runs abort and new runs refuse to start.
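A per-day cap that resets at UTC midnight can be keyed on the UTC calendar date. The following is a sketch under assumptions: `DailyUsage` and `recordApiCall` are hypothetical names, persistence is left to the caller, and only API calls are counted (token caps would follow the same shape).

```typescript
// Hypothetical sketch: the caller is assumed to persist DailyUsage between process restarts.
interface DailyUsage {
  day: string; // UTC date, e.g. "2026-04-22"
  apiCalls: number;
}

// toISOString() is always UTC, so the first 10 characters are the UTC calendar date.
function utcDay(now: Date): string {
  return now.toISOString().slice(0, 10);
}

// Counts one API call against the daily cap, starting from zero when the UTC date has changed.
function recordApiCall(usage: DailyUsage | null, maxCalls: number, now: Date): DailyUsage {
  const today = utcDay(now);
  const used = usage !== null && usage.day === today ? usage.apiCalls : 0;
  if (used >= maxCalls) {
    throw new Error(`Daily API call budget of ${maxCalls} exhausted for ${today}`);
  }
  return { day: today, apiCalls: used + 1 };
}
```

Because the stored record carries its own date, a restart in the middle of the day continues the same count, and the first call after UTC midnight starts a fresh one.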

Secret redaction. Pino scrubs 10 credential families (HubSpot PAT, Brevo API + SMTP, Bearer, Meta Graph, Google API, OpenAI, Anthropic, GitHub OAuth + PAT) from every log line. A pre-commit hook + CI workflow run the same regex against committed files; the only way past is --no-verify, which CI then catches.
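Regex-based scrubbing of this kind can be sketched in a few lines. The three patterns below are illustrative approximations of common token shapes, not the plugin's actual rule set, and `redactSecrets` is a hypothetical name.

```typescript
// Illustrative patterns only; the real rule set covers 10 credential families
// and its exact regexes are not reproduced here.
const SECRET_PATTERNS: RegExp[] = [
  /Bearer\s+[A-Za-z0-9._~+\/=-]+/g, // HTTP bearer tokens
  /ghp_[A-Za-z0-9]{36}/g,           // GitHub personal access tokens (approximate shape)
  /sk-ant-[A-Za-z0-9_-]+/g,         // Anthropic API keys (approximate shape)
];

// Replaces every match of every pattern with a fixed placeholder.
function redactSecrets(line: string): string {
  return SECRET_PATTERNS.reduce((acc, pattern) => acc.replace(pattern, "[REDACTED]"), line);
}
```

Applying the same function to log lines and to files staged for commit keeps the two checks consistent, which is the property the paragraph above describes.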

SSRF defense. Scraper-facing URLs route through safeFetch, which blocks loopback, link-local (including the cloud metadata address 169.254.169.254), RFC 1918 private, carrier-grade NAT, and IPv6 ULA and link-local addresses.
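The IPv4 half of such a blocklist is easy to sketch. This is not safeFetch itself: `isBlockedIpv4` is a hypothetical helper, IPv6 handling is omitted, and a real defense also has to check the address a hostname resolves to, not just literal IPs.

```typescript
// Hypothetical helper covering only the IPv4 ranges named above.
function parseIpv4(ip: string): number[] | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

function isBlockedIpv4(ip: string): boolean {
  const octets = parseIpv4(ip);
  if (octets === null) return true; // fail closed on anything unparseable
  const [a, b] = octets;
  return (
    a === 127 ||                          // loopback 127.0.0.0/8
    (a === 169 && b === 254) ||           // link-local 169.254.0.0/16, incl. cloud metadata
    a === 10 ||                           // RFC 1918 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) ||  // RFC 1918 172.16.0.0/12
    (a === 192 && b === 168) ||           // RFC 1918 192.168.0.0/16
    (a === 100 && b >= 64 && b <= 127)    // carrier-grade NAT 100.64.0.0/10
  );
}
```

Treating unparseable input as blocked is a deliberate choice in this sketch: a scraper should refuse a target it cannot classify.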

Quality gates. Cerberus (factual) + Hemingway (linguistic) score every draft. Combined < threshold → human review, no send.
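As an illustration of the gate, the sketch below combines the two scores and routes the draft. How agency-os actually combines the scores is not stated in this README; the arithmetic mean used here is an assumption, as are the names `GateScores` and `decideGate`. The 0.85 default mirrors the threshold shown in the pipeline diagram.

```typescript
// Hypothetical sketch: the combination rule (a plain mean) is an assumption.
interface GateScores {
  cerberus: number;  // factual score in [0, 1]
  hemingway: number; // linguistic score in [0, 1]
}

type GateDecision = "send" | "human_review";

function decideGate(scores: GateScores, threshold = 0.85): GateDecision {
  const combined = (scores.cerberus + scores.hemingway) / 2;
  // Below the threshold the draft goes to a human and is never sent automatically.
  return combined >= threshold ? "send" : "human_review";
}
```

Under this assumed rule, scores of 0.9 and 0.9 pass, while 0.9 and 0.7 average to 0.8 and land in the human review queue.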

Anti-hallucination. Every agent run is recorded for replay. Every Claude API call uses messages.parse() with a Zod schema. Every factual claim must cite a Minerva source. Eval framework runs on every PR; release blocked if hallucination rate > 1% on the golden set. Prompts are versioned. Models are pinned.
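Two of these rules reduce to simple checks: flagging claims without a known source, and blocking a release when the hallucination rate exceeds 1%. The sketch below is illustrative only; `Claim`, `uncitedClaims`, and `releaseAllowed` are hypothetical names, and the real eval framework is certainly richer than this.

```typescript
// Hypothetical sketch of the citation rule and the release gate.
interface Claim {
  text: string;
  sourceId?: string; // id of the Minerva source backing the claim, if any
}

// A claim is acceptable only if it cites a source that actually exists in the knowledge graph.
function uncitedClaims(claims: Claim[], knownSourceIds: Set<string>): Claim[] {
  return claims.filter((c) => c.sourceId === undefined || !knownSourceIds.has(c.sourceId));
}

// Release is blocked when the hallucination rate on the golden set exceeds maxRate (1% by default).
function releaseAllowed(hallucinated: number, total: number, maxRate = 0.01): boolean {
  if (total === 0) return false; // an empty golden set proves nothing, so fail closed
  return hallucinated / total <= maxRate;
}
```

For example, 1 hallucinated answer out of 100 is exactly at the limit and passes, while 2 out of 100 blocks the release.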

Sentry + OTel. Lazy-loaded — both SDKs are imported only when a customer sets the corresponding endpoint. Customers without observability backends pay zero cold-start cost.
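The lazy-loading idea can be sketched with a loader callback that is only invoked when the setting is present. `loadIfConfigured` is a hypothetical helper, not the plugin's API; in practice the callback would wrap a dynamic `import()` of the SDK.

```typescript
// Hypothetical helper: the loader runs only when the corresponding setting is present,
// so an unconfigured install never pays the cost of importing the SDK.
async function loadIfConfigured<T>(
  setting: string | undefined,
  load: () => Promise<T>,
): Promise<T | null> {
  if (setting === undefined || setting === "") return null;
  return load();
}
```

A call might look like `loadIfConfigured(env.SENTRY_DSN, () => import("@sentry/node"))`, assuming that package is installed. Without a DSN, the import expression is never evaluated.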

See SECURITY.md for the full threat model and per-defense file references.

Bring your own keys

| Key | Required? | Used for |
|---|---|---|
| HUBSPOT_API_KEY | yes | CRM read/write (kill-switched on writes) |
| BREVO_API_KEY | yes | Email send (double-kill-switched) |
| META_SYSTEM_USER_TOKEN | optional | Meta ads + page management |
| KIE_AI_API_KEY | optional | AI image / video / audio generation |
| GOOGLE_MAPS_API_KEY | optional | Company location enrichment |
| YOUTUBE_API_KEY | optional | Niche research |
| GOOGLE_ADS_DEVELOPER_TOKEN + _ACCESS_TOKEN + _CUSTOMER_ID | optional | Oracle agent (Google Ads reads) |
| MCP_API_KEY | optional (auto if omitted) | Agency-os MCP server bearer auth |
| SENTRY_DSN | optional | Error tracking |
| OTEL_EXPORTER_OTLP_ENDPOINT | optional | Distributed tracing |
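A startup check for the two required keys could look like the sketch below. It assumes the key values have already been resolved into a plain record; `missingRequiredKeys` is a hypothetical name, and the plugin's own validation may work differently.

```typescript
// Hypothetical sketch: only the two keys marked "yes" in the table are treated as required.
type KeyValues = Record<string, string | undefined>;

const REQUIRED_KEYS = ["HUBSPOT_API_KEY", "BREVO_API_KEY"] as const;

// Returns the names of required keys that are absent or empty.
function missingRequiredKeys(values: KeyValues): string[] {
  return REQUIRED_KEYS.filter((key) => !values[key]);
}
```

An empty result means the core pipeline can run; optional integrations simply stay unavailable until their keys are added.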

All sensitive keys land in your OS keychain via Claude Code's userConfig mechanism — never in code, never on agency-os infrastructure (there is no agency-os infrastructure to send them to).

What this is NOT

Not a SaaS. No backend, no shared DB, no multi-tenant infra. Each install is a single-process plugin running on the operator's machine.
Not a CRM. It uses HubSpot (or a swappable equivalent). It does not store the contact list itself.
Not a hosted email service. It uses Brevo. The send happens on Brevo's infra under the customer's account.
Not a closed black box. The methodology, the prompts, the agents, the schemas — all are in this repo, all are MIT-licensed, all are forkable.

Status (2026-04-22)

Engineering: Phase 4 observability complete. Phase 5 plugin installer complete. Phase 6 project template + contract test complete. Phase 7 Claude Code audit hooks complete. 430 tests, CI + Security green on every push.
Phase 8 (you are here): documentation pass.
Phase 9 next: internal validation — install on a clean machine, run the full pipeline against sandbox accounts, fix every gap.
Phase 10: first paying client (closed beta).
Phase 11: public launch — pricing, ToS, billing, landing page.

See SECURITY.md §5 for the four remaining engineering Roadmap items (all Phase 12.5 enterprise-tier).

Documentation

SECURITY.md — threat model, defenses, vulnerability disclosure
docs/architecture.md — the seven layers in depth
docs/install.md — install guide
docs/byok.md — per-service credential setup
docs/worked-example.md — full lead lifecycle
docs/troubleshooting.md — known failure modes + fixes
docs/faq.md — common questions
docs/validation.md — Phase 9 internal validation playbook (pnpm validate)
docs/pricing.md — Phase 11 pricing tier draft (working document, not a public commitment)
CHANGELOG.md — release notes
VERSIONING.md — SemVer + plugin pinning policy
OPERATIONS.md — rollback runbook + secret rotation procedure
CONTRIBUTING.md — engineering practices, PR workflow

Contributing

Engineering practices in CONTRIBUTING.md. In short: branch protection on main, every PR runs typecheck + lint + build + test + secret scan, no merge without green CI, conventional commits, SemVer, Renovate auto-PRs for dep updates.

Bug reports: GitHub Issues. Vulnerability reports: please email rather than open a public issue — see SECURITY.md §7.

License

MIT. See LICENSE.

Repository

github.com/dalistovision/agency-os (currently private; opens up around the Phase 11 launch)