@hadi_ali/warden
v1.0.2
Express middleware that scores requests and blocks bots/scrapers using header heuristics, IP reputation, rate limiting, and brute-force tracking.
Warden 🛡️
A high-performance Layer 7 Web Application Firewall (WAF) and bot-mitigation middleware for Express.js.
Warden doesn't just look at rate limits; it analyzes how a client interacts with your server. By combining header heuristics, Chromium fingerprinting, honeypot traps (including subdomains), and intelligent error tracking, it blocks automated scanners, scrapers, and brute-force tools while letting legitimate users and verified search engines pass through with near-zero overhead.
🚀 Installation
npm install warden
(Optional) Install drivers if you plan to use persistent storage:
npm install pg redis
Requires Node.js 18+
🛡️ What it Protects Against
Warden acts as a behavioral firewall. It instantly detects and blocks:
Directory Fuzzers (Gobuster, ffuf, Nikto): Traps scanners hitting common sensitive paths (/.env, /wp-admin) and uses intelligent 404-tracking (which safely ignores missing images/CSS) to block rapid discovery attempts.
Headless Browsers & Scrapers (Puppeteer, Playwright, Selenium): Analyzes Chromium header anomalies. If a script claims to be Chrome 120 but is missing modern sec-ch-ua headers, or spoofs a Windows OS on a Linux machine, it is penalized and blocked.
Credential Stuffing & Brute-Forcing (Hydra, Intruder): Tracks repeated 401 Unauthorized and 400 Bad Request errors natively in the background, locking out attackers attempting to guess passwords.
Reconnaissance Bots: Penalizes requests targeting sensitive subdomains (e.g., admin.*, dev.*, vpn.*) to stop infrastructure mapping.
L7 Application DDoS: Implements dynamic rate limiting. Legitimate users get generous limits, while suspicious IPs are aggressively throttled.
Fake Search Engine Bots: Attackers spoofing Googlebot or Bingbot user-agents are blocked. Warden verifies true search engine IPs directly against Google and Bing's official CIDR ranges using high-speed bitwise math.
⚠️ What it Does Not Do (Limitations)
To keep per-request overhead low, Warden is designed as a behavioral WAF. Be aware of what it does not cover:
Volumetric L3/L4 DDoS Attacks: If a botnet sends 50 Gbps of junk TCP traffic, Warden cannot save you. You still need Cloudflare or AWS Shield at the edge.
Payload Inspection (SQLi / XSS): Warden does not parse req.body to look for SQL injection or Cross-Site Scripting strings.
💻 Usage
Warden is designed to be highly resilient. You can run it entirely in memory (Zero-Config), or attach Redis and PostgreSQL for cross-server rate-limiting and long-term IP reputation tracking.
Option A: Zero-Config (In-Memory)
Perfect for single-server setups or quick deployments. Features a self-cleaning LRU cache to prevent memory leaks.
const express = require('express');
const { warden, syncVerifiedBots, reloadRanges } = require('warden');

const app = express();

// 1. Trust your proxy if behind AWS ELB, Heroku, or Cloudflare
app.set('trust proxy', 1);

// 2. Sync Google/Bing verified IPs into memory
syncVerifiedBots().then(success => {
  if (success) reloadRanges();
});

// 3. Attach middleware
app.use(warden({
  scorethreshold: 30,
  knownSubdomains: ['api', 'www'] // Whitelist your legit subdomains
}));

app.get('/', (req, res) => res.send('Protected by Warden!'));
app.listen(3000);
Option B: Production (Postgres + Redis)
For distributed setups. Redis handles cross-server rate limiting, and Postgres remembers bad IPs for days/weeks. If either database goes down, Warden safely fails open to memory.
const express = require('express');
const { Pool } = require('pg');
const { createClient } = require('redis');
const { warden, syncVerifiedBots, reloadRanges, migrate } = require('warden');

const app = express();
app.set('trust proxy', 1);

async function start() {
  const db = new Pool({ connectionString: process.env.DATABASE_URL });
  const redis = createClient({ url: process.env.REDIS_URL });
  await redis.connect();

  // Auto-create necessary tables (warden_reputation, etc.)
  await migrate(db);

  // Sync verified bots (Stores them in DB/Redis so other servers don't have to fetch)
  const success = await syncVerifiedBots();
  if (success) reloadRanges();

  app.use(warden({
    db: db,
    redis: redis,
    scorethreshold: 30,
    failopen: true, // If Redis/PG crashes, API stays online
  }));

  app.get('/', (req, res) => res.send('Enterprise protection active!'));
  app.listen(3000);
}

start();
(Note: You can also run migrations via CLI: CONNECTION_STRING="..." node node_modules/warden/app/db_connection/migrate.js)
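The onBlock, onSuspicious, and onRateLimit callbacks listed under Configuration Options can be used to feed Warden events into your own logging. The sketch below only builds an options object; the option names and callback signatures are taken from that table, while `makeWardenOptions`, the JSON log shape, and the use of `req.path` are illustrative assumptions rather than part of Warden.

```javascript
// Hypothetical helper: returns an options object for warden() whose
// callbacks write one JSON line per event to the supplied `log` function.
function makeWardenOptions(log = console.log) {
  const emit = (event, fields) => log(JSON.stringify({ event, ...fields }));

  return {
    scorethreshold: 30,
    failopen: true,
    allowlist: ['10.0.0.0/8'],        // example: internal traffic bypasses checks
    knownSubdomains: ['api', 'www'],

    // Signature from the options table: (ipHash, score, reason, req)
    onBlock: (ipHash, score, reason, req) =>
      emit('block', { ipHash, score, reason, path: req.path }),

    // Signature from the options table: (ipHash, score, req)
    onSuspicious: (ipHash, score, req) =>
      emit('suspicious', { ipHash, score, path: req.path }),

    // Signature from the options table: (ipHash, score, req)
    onRateLimit: (ipHash, score, req) =>
      emit('rate_limit', { ipHash, score, path: req.path }),
  };
}

// Usage, assuming `app` and `warden` from the examples above:
// app.use(warden(makeWardenOptions()));
```

Passing the logger in as a parameter keeps the callbacks easy to test and lets you swap console output for a structured logger later.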
🛠️ CLI
After npm install -g warden (or npx warden) you get a warden command:
warden --help # Show all commands
warden --version # Show version
warden cleanup --help # Cleanup options
warden cleanup: Data Retention
Warden never deletes your data automatically. The cleanup command is the supported way to purge old rows from the three Warden Postgres tables. Schedule it yourself with cron, a Kubernetes CronJob, pg_cron, a GitHub Actions schedule, or any other scheduler.
# Defaults: reputation > 3 days, requests > 24 hours, sessions > 30 days
warden cleanup
# See what would be deleted without actually deleting
warden cleanup --dry-run
# Custom retention
warden cleanup --reputation-days 7 --requests-hours 48 --sessions-days 90
# Override the connection string
warden cleanup --connection postgres://user:pass@host/db
Schedule with cron (Linux):
# Run every hour
0 * * * * cd /srv/myapp && /usr/bin/warden cleanup >> /var/log/warden.log 2>&1
Schedule with Kubernetes CronJob:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: warden-cleanup
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: cleanup
              image: node:20-alpine
              command: ["warden", "cleanup"]
              env:
                - name: CONNECTION_STRING
                  valueFrom:
                    secretKeyRef:
                      name: warden-db
                      key: connection-string
          restartPolicy: OnFailure
Schedule with pg_cron (no app process needed):
SELECT cron.schedule('warden-cleanup', '0 * * * *', $$
  DELETE FROM warden_reputation WHERE score < 20 AND last_seen < NOW() - INTERVAL '3 days';
  DELETE FROM warden_requests WHERE timestamp < NOW() - INTERVAL '24 hours';
  DELETE FROM warden_sessions WHERE last_seen < NOW() - INTERVAL '30 days';
$$);
⚙️ Configuration Options
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| db | pg.Pool | null | PostgreSQL pool for long-term reputation tracking and verified bot storage. |
| redis | RedisClient | null | Redis client for distributed rate limiting. Falls back to in-memory LRU store. |
| allowlist | string[] | [] | Array of exact IPs or CIDR ranges (e.g. ['10.0.0.0/8']) to always bypass checks. |
| failopen | boolean | true | If true, errors within Warden allow the request to proceed. If false, returns 500. |
| scorethreshold | number | 30 | The heuristic score at which a request is blocked with a 403 Forbidden. |
| suspiciousThreshold | number | 30 | The score at which the onSuspicious webhook/callback fires. |
| trustHeaders | boolean | false | Security: By default, Warden trusts req.ip. Set to true only if you safely parse X-Forwarded-For upstream. |
| knownSubdomains | string[] | [] | Subdomains to ignore in the recon-scanner (e.g., ['api', 'app']). |
| limits | object | {...} | Custom request limits for clean, suspicious, and hostile tiers. |
| getSessionId | function | null | Custom function to extract a user session ID from req for session-based scoring. |
| onBlock | function | null | Callback fired on block: (ipHash, score, reason, req) => {} |
| onSuspicious | function | null | Callback fired on suspicious traffic: (ipHash, score, req) => {} |
| onRateLimit | function | null | Callback fired on 429: (ipHash, score, req) => {} |

⚡ Performance & Architecture
Warden is engineered to add < 2ms of latency to legitimate requests.
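One technique this section credits for that figure is bitwise CIDR matching of verified-bot IP ranges. The following is a minimal sketch of the general idea, not Warden's actual code: `ipv4ToInt`, `parseCidr`, and `matchesAny` are hypothetical names, the sample ranges are documentation addresses rather than real Googlebot or Bingbot ranges, and IPv6 is not handled.

```javascript
// Convert a dotted-quad IPv4 string to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const parts = ip.split('.');
  if (parts.length !== 4) throw new TypeError(`Invalid IPv4 address: ${ip}`);
  let n = 0;
  for (const part of parts) {
    const octet = Number(part);
    if (!/^\d{1,3}$/.test(part) || octet > 255) {
      throw new TypeError(`Invalid IPv4 address: ${ip}`);
    }
    n = ((n << 8) | octet) >>> 0; // >>> 0 keeps the value unsigned
  }
  return n;
}

// Precompute { base, mask } once per range so each lookup is one AND and one compare.
function parseCidr(cidr) {
  const [ip, bitsStr] = cidr.split('/');
  const bits = Number(bitsStr);
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) {
    throw new TypeError(`Invalid CIDR prefix: ${cidr}`);
  }
  // JavaScript takes shift counts modulo 32, so a /0 prefix needs a special case.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return { base: (ipv4ToInt(ip) & mask) >>> 0, mask };
}

// True if `ip` falls inside any of the pre-parsed ranges.
function matchesAny(ip, ranges) {
  const n = ipv4ToInt(ip);
  return ranges.some(({ base, mask }) => ((n & mask) >>> 0) === base);
}

// Usage with placeholder ranges (RFC 5737 documentation blocks):
const ranges = ['192.0.2.0/24', '198.51.100.0/24'].map(parseCidr);
console.log(matchesAny('192.0.2.77', ranges));
console.log(matchesAny('203.0.113.5', ranges));
```

Parsing each range once at load time means the per-request cost is a handful of integer operations per range, with no string work on the hot path.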
Zero-Score DB Bypass: If a user connects with a standard, valid browser, their heuristic score evaluates to 0. Warden immediately skips the PostgreSQL reputation UPSERT entirely, saving heavy database I/O.
Post-Response Math: Brute-force calculations (like tracking 404s/401s) are executed asynchronously via res.on('finish'). The user receives their HTTP response instantly; the math happens in the background.
Bitwise CIDR Matching: Verified bots are resolved using 32-bit integer conversion and bitwise masking, allowing hundreds of Google/Bing CIDR ranges to be evaluated in fractions of a millisecond.
📝 License
MIT License © 2025
