kafkacode
v1.5.0
Published
Open-source, local-first privacy code scanner for PII leaks, hardcoded secrets, GDPR/CCPA compliance, SARIF, and CI/CD
Maintainers
Keywords
Readme
KafkaCode - Open-Source Privacy Code Scanner
Local-first PII scanner and secret detection CLI for source code, CI/CD, GDPR, CCPA, and SARIF workflows.
KafkaCode catches PII leaks, hardcoded secrets, and privacy compliance risks before they ship. One command gives you a clear A+ → F privacy grade, CI-ready exit codes, JSON/SARIF output, and optional BYO-key AI analysis.
📖 Documentation · Quickstart · Features · CI/CD · Contributing
Why KafkaCode?
Most scanners stop at "you leaked an AWS key." KafkaCode goes further — it grades how your code handles personal data, flags GDPR/CCPA risks, and catches hardcoded secrets with a local-first pattern scanner and an optional AI pass for the context that regex alone can't see.
You get one number a whole team understands — a privacy grade from A+ to F — plus a non-zero exit code that fails the build when something sensitive slips in.
npx kafkacode scan .No install. No signup. No config.
⚡ Quickstart
# Run it once, anywhere (no install)
npx kafkacode scan .
# Or install globally
npm install -g kafkacode
kafkacode scan ./src --verbose✨ Features
- 🔑 Secret detection — AWS & Stripe keys, private keys, high-entropy strings
- 🕵️ PII detection — emails, phone numbers, IP addresses
- 🛡️ Privacy compliance scanning — source-code checks for GDPR, CCPA, and data privacy risks
- 🤖 AI-powered analysis — contextual privacy issues a regex would miss
- 🎓 Privacy grade — a single, shareable A+ → F score
- 🏷️ Grade badge — drop your score into your README (
--badge) - ⚡ Fast & offline — pattern scanning needs no network
- 📄 SARIF & JSON output — integrate with GitHub code scanning and security dashboards
- 🧰 Config, ignores & baselines — adopt safely in existing repositories
- 🔒 Redacted output by default — prevent secrets from leaking into logs
- 🌐 7 languages — Python, JavaScript, TypeScript, Java, Go, Ruby, PHP
- 🚀 CI/CD ready — clean exit codes + a one-line GitHub Action
📊 Example output
🎯 PRIVACY SCAN REPORT
════════════════════════════════════════════════════════════════
📊 SCAN SUMMARY
📁 Directory: ./src
📄 Files Scanned: 18
🔍 Total Issues: 4
🏆 Privacy Grade: 🔴 F
🚨 Critical: 1 🔥 High: 1 ⚠️ Medium: 2 🔵 Low: 0
🚨 CRITICAL
┌─ AWS Access Key ID detected
│ 📍 src/config.js:12
│ 💡 Move credentials to environment variables or a secrets manager.
└─
⚠️ MEDIUM
┌─ Email address detected (PII)
│ 📍 src/users.js:47
│ 💡 Avoid hardcoding personal data; load it at runtime.
└─🏷️ Privacy grade & badge
KafkaCode distills every scan into one grade:
| Grade | Meaning | | :---: | ------- | | 🟢 A+ / A / A- | Excellent — no or only low-severity issues | | 🟡 B+ / B / B- | Good — a few medium-severity issues | | 🟠 C+ / C / C- | Needs attention — high-severity issues present | | 🔴 D / F | Critical privacy/secret exposure |
Show it off in your own README:
kafkacode scan . --badge🏷️ Privacy Grade Badge — paste into your README:
→
🚀 CI/CD integration
GitHub Action
# .github/workflows/privacy.yml
name: Privacy Scan
on: [push, pull_request]
jobs:
scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: nikhil-kapu/kafkacode@v1
with:
path: ./srcAny CI / pre-commit
# Exits non-zero when issues are found, failing the build
npx kafkacode scan ./src
# Fail only on high or critical findings
npx kafkacode scan ./src --fail-on high
# Generate SARIF for GitHub code scanning
npx kafkacode scan ./src --format sarif --output kafkacode.sarif --no-fail🔍 What it detects
| Severity | Examples |
| -------- | -------- |
| 🚨 Critical | AWS keys, Stripe live keys, private keys |
| 🔥 High | JWTs, password=, api_key=, token= and other secrets in assignments |
| ⚠️ Medium | Emails, phone numbers, high-entropy strings |
| 🔵 Low | IP addresses |
🧠 How it works
your code ─▶ FileScanner ─▶ ┌─ PatternScanner (regex, fully offline)
└─ LLMAnalyzer (optional AI context)
│
▼
ReportGenerator ─▶ grade + findings + exit codePattern-based detection runs entirely on your machine with no network calls. The optional AI layer adds contextual findings for the cases static rules can't catch.
🤖 AI mode (optional, bring-your-own-key)
Pattern scanning works out of the box with no setup and no network calls. To add AI-powered contextual findings, bring your own API key — KafkaCode calls an OpenAI-compatible chat API directly, defaulting to Groq (which has a free tier):
export KAFKACODE_API_KEY=your_key_here
kafkacode scan ./src| Variable | Default | Purpose |
| -------- | ------- | ------- |
| KAFKACODE_API_KEY | (unset) | Your provider API key — enables AI mode |
| KAFKACODE_API_URL | https://api.groq.com/openai/v1 | OpenAI-compatible base URL (Groq, OpenAI, OpenRouter, local models…) |
| KAFKACODE_MODEL | llama-3.1-8b-instant | Model name |
Without a key, KafkaCode runs pattern-only and never sends your code anywhere.
Pass --no-ai to force pattern-only even when a key is set.
🆚 How it compares
| | KafkaCode | gitleaks / trufflehog | semgrep | | ---------------------------- | :-------: | :-------------------: | :-----: | | Hardcoded secrets | ✅ | ✅ (deep, git log) | ➖ | | PII / personal-data findings | ✅ | ➖ | ➖ | | Privacy grade (A+ → F) | ✅ | ➖ | ➖ | | AI contextual analysis | ✅ | ➖ | ➖ | | SARIF output | ✅ | ➖ | ✅ | | Zero-config, one command | ✅ | ✅ | ➖ |
KafkaCode focuses on privacy and developer-friendly grading — it complements deep secret scanners rather than replacing them.
📚 Guides
- PII scanner for source code
- Secret scanning in CI/CD
- GDPR code scanning
- SARIF privacy scanner
- Local-first privacy scanner
🗺️ Roadmap
- [x] Bring-your-own-key AI — call Groq / OpenAI-compatible providers directly
- [x]
--json& SARIF output — SARIF integrates with the GitHub Security tab - [x] Config file &
.kafkacodeignore - [x] Baseline file to adopt on existing codebases
- [x] More file types (
.env, YAML, Terraform, Dockerfiles) - [x] Redacted snippets by default, with
--show-secretsopt-in - [ ] Provider validation for selected secret types
- [ ] More language-aware privacy rules
Ideas and PRs welcome — see CONTRIBUTING.md.
🤝 Contributing
Contributions of all kinds are welcome — bug reports, new detection patterns, and docs. Start with CONTRIBUTING.md, and please report security issues per our Security Policy.
📄 License
MIT © KafkaLabs
