@littlebearapps/cf-monitor
v0.3.7
Self-contained Cloudflare account monitoring: error collection, feature budgets, circuit breakers, cost protection. One worker per account.
Cloudflare Workers are great until a bug writes 4.8 billion D1 rows while you're asleep. cf-monitor wraps your workers with a single monitor() call, tracks every D1/KV/R2/AI/Queue operation, and shuts things down before they become expensive. One npm package, one worker per account, three CLI commands to production.
🛡️ Why cf-monitor?
Traditional monitoring SDKs need a central account, cross-account forwarding, HMAC secrets, and a fleet of workers to process telemetry. cf-monitor is different.
- Your account monitors itself — install on any Cloudflare account and it discovers all workers, tracks every binding call, and creates GitHub issues for errors. No central infrastructure, no cross-account secrets.
- Three commands from zero to monitored: init, deploy, wire. Budget defaults auto-calculated from your plan. You can be monitoring in production before your coffee gets cold.
- Circuit breakers that actually trip: per-invocation limits catch infinite loops on the first request. Daily budgets warn at 70%, stop at 100%. Monthly budgets align to your billing cycle. The runaway D1 loop from January 2026 ($3,434) would have been stopped at row 1,001.
- Zero D1, zero queues, zero migrations — metrics go to Analytics Engine (100M writes/month free). State lives in KV. No database schema to maintain, no queue infrastructure to provision.
- Fail-open by default — if cf-monitor has an internal error, your worker keeps running normally. Monitoring should never be the thing that breaks production.
- Built for solo developers — one worker per account, auto-discovery, auto-budgets, Slack alerts with dedup. Designed for people who ship fast and sleep well.
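The two budget rules in the list above (warn at 70%, stop at 100%, and a hard per-invocation cap) can be sketched as follows. This is a minimal illustration, not cf-monitor's internals: `budgetStatus`, `InvocationCounter`, and `InvocationLimitError` are hypothetical names, and treating a zero limit as exhausted is an assumption.

```typescript
// Hypothetical names throughout; this is not cf-monitor's actual API.
type BudgetStatus = 'ok' | 'warning' | 'tripped';

// Daily budget rule: warn at 70% of the limit, stop at 100%.
function budgetStatus(used: number, limit: number): BudgetStatus {
  if (limit <= 0) return 'tripped'; // assumption: a zero budget is always exhausted
  const ratio = used / limit;
  if (ratio >= 1) return 'tripped';
  if (ratio >= 0.7) return 'warning';
  return 'ok';
}

class InvocationLimitError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'InvocationLimitError';
  }
}

// Per-invocation rule: count operations within one request and throw
// as soon as the count goes past the limit.
class InvocationCounter {
  private count = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  record(n: number = 1): void {
    this.count += n;
    if (this.count > this.limit) {
      throw new InvocationLimitError(
        `limit ${this.limit} exceeded at operation ${this.count}`,
      );
    }
  }
}
```

With a limit of 1,000, the first 1,000 calls to `record()` pass and the next one throws, which is the "stopped at row 1,001" behaviour described above.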
⚡ Quick Start
1. Install
npm install @littlebearapps/cf-monitor
2. Set up the monitor worker
# Provision KV + AE, generate config files
npx cf-monitor init --account-id YOUR_ACCOUNT_ID
# Deploy the single monitor worker
npx cf-monitor deploy
# Auto-wire tail_consumers + bindings to all your worker configs
npx cf-monitor wire --apply
3. Wrap your workers
import { monitor } from '@littlebearapps/cf-monitor';

export default monitor({
  fetch: async (request, env, ctx) => {
    const data = await env.DB.prepare('SELECT * FROM items LIMIT 100').all();
    return Response.json(data);
  },
  scheduled: async (event, env, ctx) => {
    await env.KV.put('last-run', new Date().toISOString());
  },
});
That's it. Worker name, feature IDs, bindings, and budgets are all auto-detected.
🎯 Features
- 🐛 Error collection — tail worker captures errors from every worker, deduplicates via fingerprint, creates GitHub issues with P0–P4 priority labels
- 💰 Feature budgets — per-feature daily and monthly limits with automatic circuit breakers. Warned at 70%, stopped at 100%
- 🔴 Circuit breakers — three-tier kill switches (feature, account, global) via KV. Auto-reset after configurable TTL
- 🛡️ Cost protection — per-invocation limits prevent runaway loops. Catches the $5K bug on the first request
- 📡 Gap detection — identifies workers that aren't sending telemetry. Shows where coverage is missing
- 🔍 Worker discovery — auto-discovers all workers via CF API. No manual registry needed
- 🔔 Slack alerts — budget warnings, errors, gaps, cost spikes. KV-based dedup so you don't get spammed
- 📈 Cost spike detection — flags when hourly costs exceed 200% of the 24-hour baseline
- ❤️ Synthetic health checks — hourly CB pipeline validation: trip, verify, reset, verify
- 📊 Plan detection — auto-detects Free vs Paid plan. Selects correct budget defaults automatically
- 📅 Billing period tracking — aligns monthly budgets to your actual billing cycle, not calendar months
- 📋 Account usage dashboard — queries CF GraphQL for Workers, D1, KV, R2, and Durable Objects. Shows % of plan used
- 🔧 Self-monitoring — tracks its own cron execution, error rates, and staleness. Alerts if cf-monitor itself is unhealthy
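The cost spike rule in the list above (hourly cost above 200% of the 24-hour baseline) can be sketched like this. The function name `isCostSpike` is hypothetical, and two details are assumptions: the baseline is taken as the mean of the previous hourly costs, and a zero baseline never counts as a spike.

```typescript
// Returns true when the latest hour's cost exceeds `threshold` times the
// mean hourly cost of the preceding window (24 entries for a 24-hour baseline).
function isCostSpike(
  previousHourlyCosts: number[],
  latestHourCost: number,
  threshold: number = 2.0, // 200% of baseline
): boolean {
  if (previousHourlyCosts.length === 0) return false; // no baseline yet
  const total = previousHourlyCosts.reduce((sum, cost) => sum + cost, 0);
  const baseline = total / previousHourlyCosts.length;
  if (baseline <= 0) return false; // assumption: nothing to compare against
  return latestHourCost > baseline * threshold;
}
```

Using a strict comparison means an hour at exactly twice the baseline does not alert, which matches the "exceed 200%" wording.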
Optional (AI-powered, disabled by default)
- 🤖 Pattern discovery — AI detection of transient error patterns (opt-in)
- 📝 Health reports — natural language account health summaries (opt-in)
- 🔬 Coverage auditor — AI scoring of integration quality (opt-in)
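Error deduplication, as listed under Features, depends on mapping repeated errors to one fingerprint. The sketch below shows the general idea only; cf-monitor's actual fingerprinting algorithm is not documented here, so the normalisation rules, the FNV-1a hash, and the names `normaliseMessage`, `fnv1a`, and `fingerprint` are all assumptions.

```typescript
// Replace the volatile parts of a message (UUIDs, numbers, extra whitespace)
// so that repeats of the same error collapse to one string.
function normaliseMessage(message: string): string {
  return message
    .toLowerCase()
    .replace(/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/g, '<uuid>')
    .replace(/\d+/g, '<n>')
    .replace(/\s+/g, ' ')
    .trim();
}

// FNV-1a (32-bit) over UTF-16 code units, chosen only because it is short
// and dependency-free. Returns 8 hex characters.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ('00000000' + (hash >>> 0).toString(16)).slice(-8);
}

// One fingerprint per (worker, error type, normalised message).
function fingerprint(worker: string, errorName: string, message: string): string {
  return fnv1a(`${worker}|${errorName}|${normaliseMessage(message)}`);
}
```

A fingerprint like this can serve as a KV key: if the key already exists, the error is a repeat and no new GitHub issue is needed.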
🔧 SDK API
Zero-config (most workers)
import { monitor } from '@littlebearapps/cf-monitor';

export default monitor({
  fetch: handler,
  scheduled: cronHandler,
  queue: queueHandler,
});
Custom feature IDs
export default monitor({
  features: {
    'POST /api/scan': 'scanner:social', // Custom route feature
    'GET /health': false,               // Exclude from tracking
    '0 2 * * *': 'cron:arxiv-harvest',  // Custom cron feature
  },
  fetch: handler,
  scheduled: cronHandler,
});
Per-invocation limits
export default monitor({
  limits: {
    d1Writes: 500, // Throws RequestBudgetExceededError if exceeded
    aiRequests: 10,
  },
  onCircuitBreaker: (err) => {
    return new Response('Temporarily unavailable', { status: 503 });
  },
  fetch: handler,
});
What monitor() auto-detects
| Setting | How it works | Manual override |
|---------|-------------|-----------------|
| Worker name | config.workerName > env.WORKER_NAME > env.name > 'worker' | workerName option or wire --apply |
| Feature IDs | {worker}:{handler}:{method}:{path-slug} | featureId, featurePrefix, or features map |
| Bindings | Duck-typing at runtime (D1, KV, R2, AI, Queue, DO, Vectorize, Workflow) | excludeBindings to skip specific keys |
| Budget defaults | Auto-detected from CF plan (free/paid) via Subscriptions API | budgets in config or config sync CLI |
| Health endpoint | /_monitor/health | healthEndpoint option or false to disable |
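The worker name precedence and the feature ID pattern from the table can be sketched as below. Only the precedence order and the `{worker}:{handler}:{method}:{path-slug}` shape come from the table; the function names, the slug rules, the upper-cased method, and the `root` fallback are assumptions for illustration.

```typescript
// Precedence from the table:
// config.workerName > env.WORKER_NAME > env.name > 'worker'
function resolveWorkerName(
  config: { workerName?: string },
  env: Record<string, unknown>,
): string {
  const candidates: unknown[] = [config.workerName, env['WORKER_NAME'], env['name']];
  for (const candidate of candidates) {
    if (typeof candidate === 'string' && candidate.length > 0) return candidate;
  }
  return 'worker';
}

// Turn a URL path into a slug. The exact slug rules are an assumption.
function pathSlug(path: string): string {
  const slug = path
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
  return slug === '' ? 'root' : slug;
}

// Builds {worker}:{handler}:{method}:{path-slug}.
function featureId(worker: string, handler: string, method: string, path: string): string {
  return [worker, handler, method.toUpperCase(), pathSlug(path)].join(':');
}
```

Under these assumptions, a `POST /api/scan` request handled by a worker named `api` would be tracked as `api:fetch:POST:api-scan` unless a features map overrides it.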
🏗️ Architecture
                    +-----------------------------------------+
                    |            cf-monitor worker            |
Your Workers -tail->| tail()  -> fingerprint -> GitHub Issues |-> Issues
     |              | cron()  -> metrics, budgets,            |-> Slack
     |              |            gaps, spikes, discovery      |
     |              | fetch() -> /status, /errors,            |
     |              |            /budgets, /workers           |
     |              +-----------------------------------------+
     |                          |
     +---- AE write --> Analytics Engine <-- AE SQL query
One worker handles everything
| Handler | Schedule | Purpose |
|---------|----------|---------|
| tail() | Real-time | Error capture from all tailed workers |
| scheduled() | */15 * * * * | Gap detection, cost spike detection |
| scheduled() | 0 * * * * | CF GraphQL metrics, account usage collection, budget enforcement, synthetic CB health |
| scheduled() | 0 0 * * * | Daily rollup + warning digest, worker discovery |
| fetch() | On-demand | Status API, admin endpoints, GitHub webhooks |
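Because one `scheduled()` handler serves three cron schedules, the worker has to branch on which trigger fired. In Cloudflare Workers the scheduled event exposes the matching expression as a `cron` string, so a lookup like the one below is enough. The task names and the `tasksForCron` helper are hypothetical labels for the purposes in the table, not cf-monitor's real function names.

```typescript
// Map each cron expression from the table to the jobs it should run.
// Task names are illustrative labels only.
const CRON_TASKS: Record<string, string[]> = {
  '*/15 * * * *': ['gap-detection', 'cost-spike-detection'],
  '0 * * * *': [
    'graphql-metrics',
    'account-usage',
    'budget-enforcement',
    'synthetic-cb-health',
  ],
  '0 0 * * *': ['daily-rollup', 'warning-digest', 'worker-discovery'],
};

// Returns the tasks for the cron expression that fired, or an empty list
// for an expression the worker does not know about.
function tasksForCron(cron: string): string[] {
  return CRON_TASKS[cron] ?? [];
}
```

Inside a scheduled handler this would be called as `tasksForCron(event.cron)`, with each returned task dispatched to its own function.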
Storage — zero D1
| Store | Purpose | Cost |
|-------|---------|------|
| Analytics Engine | All metrics and telemetry (90-day retention, SQL queries) | 100M writes/month free |
| KV (1 namespace) | Circuit breaker state, budget config, error dedup, worker registry | Reads: $0.50/M, Writes: $5/M |
Bindings tracked
D1 (reads, writes, rows) · KV (reads, writes, deletes, lists) · R2 (Class A, Class B) · Workers AI (requests, neurons) · Vectorize (queries, inserts) · Queue (messages) · Durable Objects (requests) · Workflows (invocations)
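Binding detection is described as duck-typing at runtime. A rough sketch of that idea for four of the binding types listed above follows. The method sets match the public D1, KV, R2, and Queue binding APIs, but which methods cf-monitor actually probes is an assumption, and `detectBinding` is a hypothetical name.

```typescript
type BindingKind = 'd1' | 'kv' | 'r2' | 'queue' | 'unknown';

// True when `value` is an object exposing every named method.
function hasMethods(value: unknown, methods: string[]): boolean {
  if (typeof value !== 'object' || value === null) return false;
  const obj = value as Record<string, unknown>;
  return methods.every((name) => typeof obj[name] === 'function');
}

// Classify an env entry by the shape of its API rather than its key name.
// KV and R2 share get/put/delete/list, so each check includes one method
// the other lacks (getWithMetadata for KV, head for R2).
function detectBinding(value: unknown): BindingKind {
  if (hasMethods(value, ['prepare', 'batch', 'exec'])) return 'd1';
  if (hasMethods(value, ['get', 'put', 'delete', 'list', 'getWithMetadata'])) return 'kv';
  if (hasMethods(value, ['get', 'put', 'delete', 'list', 'head'])) return 'r2';
  if (hasMethods(value, ['send', 'sendBatch'])) return 'queue';
  return 'unknown';
}
```

Looping over `Object.entries(env)` with a check like this lets a wrapper find bindings without any configuration; plain strings and secrets fall through as `unknown`.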
💻 CLI Commands
| Command | Purpose | Key flags |
|---------|---------|-----------|
| npx cf-monitor init | Provision KV + AE, generate config | --account-id, --github-repo, --slack-webhook |
| npx cf-monitor deploy | Deploy the cf-monitor worker | --dry-run |
| npx cf-monitor wire | Auto-add tail_consumers + bindings to all worker configs | --apply, --dir |
| npx cf-monitor status | Show monitor health and CB states | --json |
| npx cf-monitor coverage | Show which workers are/aren't monitored | --json |
| npx cf-monitor secret | Set a secret on the cf-monitor worker | [name] |
| npx cf-monitor usage | Show account-wide CF service usage vs plan allowances | --json |
| npx cf-monitor config sync | Push budgets from YAML to KV | — |
| npx cf-monitor config validate | Validate cf-monitor.yaml against schema | — |
| npx cf-monitor upgrade | Safe npm update + re-deploy | --dry-run |
🌐 API Endpoints
The monitor worker exposes these endpoints. All GET endpoints include CORS headers (Access-Control-Allow-Origin: *) for browser-based monitoring dashboards.
| Method | Path | Purpose |
|--------|------|---------|
| GET | /_health | Health check (for Gatus or uptime monitors) |
| GET | /status | Account health, CB states, worker count |
| GET | /errors | Recent error fingerprints with GitHub issue links |
| GET | /budgets | Active circuit breakers and budget utilisation |
| GET | /workers | Auto-discovered workers on the account |
| GET | /plan | Detected plan type, billing period, days remaining, allowances |
| GET | /usage | Account-wide per-service usage with plan context (approximate) |
| GET | /self-health | Self-monitoring status: stale crons, error counts, handler breakdown |
| POST | /webhooks/github | GitHub webhook receiver (issue close/reopen/mute sync) |
| POST | /admin/cron/{name} | Manually trigger any cron (requires ADMIN_TOKEN) |
| POST | /admin/cb/trip | Trip a feature circuit breaker (requires ADMIN_TOKEN) |
| POST | /admin/cb/reset | Reset a feature circuit breaker (requires ADMIN_TOKEN) |
| POST | /admin/cb/account | Set account-level CB status (requires ADMIN_TOKEN) |
⚙️ Configuration
Generated by npx cf-monitor init — see full reference.
# cf-monitor.yaml
account:
  name: my-project
  cloudflare_account_id: "abc123..."
github:
  repo: "owner/repo"
  token: $GITHUB_TOKEN
alerts:
  slack_webhook: $SLACK_WEBHOOK_URL
# Optional: sensible defaults auto-calculated from your CF plan
# budgets:
#   daily:
#     d1_writes: 50000
#     kv_writes: 10000
#   monthly:
#     d1_writes: 1000000
# ai:
#   enabled: false
#   pattern_discovery: false
#   health_reports: false
📦 Requirements
- Node.js 20+ (22 recommended)
- npm 10+
- A Cloudflare account (Free or Paid — plan auto-detected)
- At least one deployed Worker on the account
- Wrangler CLI installed (npm install -g wrangler)
- A Cloudflare API token with:
  - Workers KV Storage: Edit
  - Account Analytics: Read
  - Workers Scripts: Edit
  - Optional: Account Settings: Read (for automatic plan detection)
- ADMIN_TOKEN secret (recommended for production; protects admin endpoints). See Security
🔄 Upgrading
npm update @littlebearapps/cf-monitor
npx cf-monitor upgrade # re-deploys the monitor worker
Or preview first:
npx cf-monitor upgrade --dry-run
See the changelog for version history.
📚 Documentation
Getting started
- Step-by-step setup — from install to verified monitoring
- Configuration reference — all YAML and SDK options
Guides
- Error collection — fingerprinting, dedup, GitHub issues
- Budgets & circuit breakers — 4 layers of cost protection
- Cost protection — the $4,868 story and how cf-monitor prevents it
- Worker discovery — auto-discovery, exclude patterns
- Slack alerts — alert types, dedup, webhook setup
- Plan detection — Free vs Paid, billing period, permissions
- Account usage — GraphQL queries, services, limitations
- Gap detection — coverage monitoring
- Self-monitoring — cron tracking, error counts, debugging cf-monitor itself
How-to
- GitHub webhooks — bidirectional issue sync setup
- Custom feature IDs — featureId, featurePrefix, features map
Security & Reference
- Security — admin auth, secrets, threat model, data exposure
- Troubleshooting — common issues with solutions
- Changelog — version history
🤝 Contributing
Contributions welcome! See CONTRIBUTING.md for development setup and guidelines.
git clone https://github.com/littlebearapps/cf-monitor.git
cd cf-monitor
npm install
npm test # 290 unit tests
npm run test:integration # 53 integration tests (needs CF credentials)
npm run typecheck # TypeScript strict mode
🙏 Acknowledgements
The project was born from a $4,868 billing incident in January 2026, which proved that Cloudflare monitoring must be self-contained per account. The circuit breaker patterns, AE telemetry layout, and error fingerprinting algorithm were refined through months of real production use before becoming cf-monitor.
📄 Licence
MIT — Made by Little Bear Apps
