@broberg/secret-scan

v0.1.5

Published

3 days ago

Pure, dependency-free secret/credential redaction for the broberg.ai fleet — redactSecrets / hasSecret over a curated, ordered SECRET_PATTERNS set. Redact at write + egress boundaries so keys never land in a DB, chat, or KB. Lifted from broberg/trail F197

Vulnerabilities: 0 High, 0 Medium, 0 Low

cbroberg

secret redaction credentials security api-key token secret-scanning broberg

@broberg/secret-scan

Pure, dependency-free secret/credential redaction for the broberg.ai fleet. Catch leaked API keys and tokens at your write + egress boundaries so a key never lands in a database, a chat answer, a search result, or a shared knowledge base.

Lifted from broberg/trail F197 — the second-brain safeguard that found 9 real leaked keys already sitting in a shared KB. components owns + publishes it; every repo consumes the same canonical pattern set, so detection never drifts.

```
npm i @broberg/secret-scan
```

Usage

```
import { redactSecrets, hasSecret } from "@broberg/secret-scan";

const { redacted, findings } = redactSecrets("the key is sk-ant-api03-… use it");
// redacted → "the key is [REDACTED:anthropic-api-key] use it"
// findings → [{ label: "anthropic-api-key", count: 1 }]

hasSecret("nothing here"); // false
```

redactSecrets is pure and deterministic: clean input is returned byte-identical, with findings: []. It replaces every detected secret with [REDACTED:<label>] and never blocks the write, so the surrounding knowledge survives.

Two recommended integration shapes

Write boundary (ingest gate) — redact before you persist, so secrets never enter storage:
```
await db.insert({ content: redactSecrets(content).redacted });
```
Egress guardrail — scrub before a value leaves to a user or an LLM. The highest-value guard is scrubbing retrieved context before it enters a prompt, so the model can never see (and never echo) a secret that predates the gate.
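The egress guardrail described above could be wired in roughly as follows. This is a minimal sketch, not part of the package: `scrubContext`, `Redactor`, and `ScrubbedContext` are hypothetical names, and the redactor is passed in as a parameter so the helper only assumes the `RedactionResult` shape listed in the API section.

```typescript
// Shapes copied from the README's API section.
interface RedactionFinding { label: string; count: number; }
interface RedactionResult { redacted: string; findings: RedactionFinding[]; }

// Hypothetical: any function with the redactSecrets(text) call shape.
type Redactor = (text: string) => RedactionResult;

// Hypothetical result shape for this sketch.
interface ScrubbedContext { context: string; labels: string[]; }

// Scrub every retrieved chunk before it is joined into prompt context,
// and collect the labels that fired so the caller can log or alert on them.
function scrubContext(chunks: string[], redact: Redactor): ScrubbedContext {
  const labels = new Set<string>();
  const cleaned = chunks.map((chunk) => {
    const { redacted, findings } = redact(chunk);
    for (const finding of findings) labels.add(finding.label);
    return redacted;
  });
  return { context: cleaned.join("\n---\n"), labels: Array.from(labels).sort() };
}
```

In an application this would be called as `scrubContext(retrievedChunks, (t) => redactSecrets(t))`, and only the returned `context` would be placed in the prompt. The `---` separator is an arbitrary choice for the sketch.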

Custom / per-tenant patterns

Add your own patterns on top of the canonical set — they run after the canonical patterns, so canonical attribution always wins:

```
redactSecrets(text, {
  extraPatterns: [{ label: "acme-key", description: "ACME key", regex: /\bACME-[0-9]{6}\b/g }],
});
```
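For the per-tenant case, patterns often live in configuration as strings rather than regex literals. The sketch below shows one way such config might be compiled into `SecretPattern` objects. `TenantPatternConfig` and `compileTenantPatterns` are hypothetical names, not package exports, and the use of the `g` flag is an assumption based on the example above rather than a documented requirement.

```typescript
// Shape copied from the README's API section.
interface SecretPattern { label: string; description: string; regex: RegExp; }

// Hypothetical per-tenant config entry: the regex is stored as a source string.
interface TenantPatternConfig { label: string; description: string; source: string; }

// Compile config entries into SecretPattern objects.
// new RegExp throws a SyntaxError on an invalid source, so bad config fails
// loudly at load time instead of silently matching nothing.
function compileTenantPatterns(configs: TenantPatternConfig[]): SecretPattern[] {
  return configs.map(({ label, description, source }) => ({
    label,
    description,
    // Assumption: global flag, mirroring the /g in the README example.
    regex: new RegExp(source, "g"),
  }));
}

const acmePatterns = compileTenantPatterns([
  { label: "acme-key", description: "ACME key", source: "\\bACME-[0-9]{6}\\b" },
]);
```

The result could then be passed as `redactSecrets(text, { extraPatterns: acmePatterns })`.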

What it detects

A curated, ordered set (SECRET_PATTERNS) of named, low-false-positive regexes — most-specific first so attribution is correct:

LLM: Anthropic (sk-ant-…, incl. oat01-), OpenAI (sk-/sk-proj-), OpenRouter (sk-or-v1-), ElevenLabs, fal.ai, Google/Gemini (AIza…), Google OAuth (GOCSPX-), Mistral (field-anchored).
Cloud / infra: AWS (AKIA…), GitHub, GitLab, Slack, Stripe live, Resend, Fly.io, Cloudflare (global key · API token via field-context · Turnstile secret), Supabase (sbp_ / sb_secret_), npm (npm_…).
Fleet: upmetrics (uk_), cardmem (pa_/pi_/pk_, piw_), cms (wh_), trail (trail_), cronjobs (cj_ + 43 base64url).
Generic: JWT (eyJ… — also Turso + Supabase service_role tokens), PEM private-key blocks, Discord bot/MFA tokens, and labeled-hex-secret (a 40+ hex value assigned to a secret/token/password/api-key-named field).
Field-anchored (context-only, to avoid FP on bare tokens): Cloudflare API token, Mistral, Vimeo — matched only next to their env-var name.

Design notes

Pattern-based, not entropy-based: redacting a real fact corrupts knowledge, so we would rather miss an exotic token than raise a false positive.
Never a bare hex pattern — it would hit git shas/hashes. Prefix-less service secrets are caught only via the labeled-hex-secret name-context rule.
Order is API — specific patterns run before generic ones (sk-ant- before sk-); a test asserts it.
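Why order matters can be seen in a toy version of an ordered redaction pass. This is an illustration under stated assumptions, not the package's implementation: the two regexes are simplified stand-ins for the real `SECRET_PATTERNS` entries, and `toyRedact` and the `generic-sk-key` label are hypothetical.

```typescript
interface ToyPattern { label: string; regex: RegExp; }

// Simplified stand-in patterns, most-specific first.
const SPECIFIC_FIRST: ToyPattern[] = [
  { label: "anthropic-api-key", regex: /\bsk-ant-[A-Za-z0-9_-]{8,}/g },
  { label: "generic-sk-key", regex: /\bsk-[A-Za-z0-9_-]{8,}/g },
];

// Apply each pattern in list order; once a span is replaced by a marker,
// later patterns can no longer claim it.
function toyRedact(text: string, patterns: ToyPattern[]): string {
  let out = text;
  for (const { label, regex } of patterns) {
    out = out.replace(regex, `[REDACTED:${label}]`);
  }
  return out;
}

const sample = "key: sk-ant-abc12345XYZ";

const specificFirst = toyRedact(sample, SPECIFIC_FIRST);
// → "key: [REDACTED:anthropic-api-key]"

const genericFirst = toyRedact(sample, SPECIFIC_FIRST.slice().reverse());
// → "key: [REDACTED:generic-sk-key]" (the secret is still redacted, but misattributed)
```

Both orders hide the secret, but only the specific-first order labels it correctly, which is the property the ordering test protects.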

API

```
interface SecretPattern { label: string; description: string; regex: RegExp; }
interface RedactionFinding { label: string; count: number; }
interface RedactionResult { redacted: string; findings: RedactionFinding[]; }
interface RedactOptions { extraPatterns?: SecretPattern[]; }

const SECRET_PATTERNS: SecretPattern[];
function redactSecrets(text: string, opts?: RedactOptions): RedactionResult;
function hasSecret(text: string, opts?: RedactOptions): boolean;
function redactionMarker(label: string): string; // `[REDACTED:${label}]`
```

MIT · part of the @broberg/* shared-library family.
