opencode-data-size-guardrail
v0.1.1
OpenCode plugin that blocks accidental huge data reads and exports.
opencode-data-size-guardrail
🤯 Your Agent Ran cat huge.json And Your Tokens Died
Your agent runs this:
cat token_flow_analysis.json
Looks harmless.
But the file is this big:
218.5 MB
~54,625,000 tokens
Now your LLM is slow, confused, and expensive.
opencode-data-size-guardrail stops that.
It is a tiny OpenCode plugin that warns or blocks when an agent tries to read, print, or create giant raw data files.
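The token figure quoted above is consistent with a rough rule of about 4 bytes per token, treating 1 MB as 1,000,000 bytes. The sketch below only illustrates that arithmetic. The constant `BYTES_PER_TOKEN` and the function `estimateTokens` are assumptions for this example, not names or logic confirmed from the plugin.

```typescript
// Assumption: roughly 4 bytes per token. This is a common back-of-the-envelope
// heuristic and is NOT confirmed to be the formula the plugin uses.
const BYTES_PER_TOKEN = 4;

// Hypothetical helper: estimate how many tokens a raw file would cost.
function estimateTokens(sizeBytes: number): number {
  return Math.round(sizeBytes / BYTES_PER_TOKEN);
}

// 218.5 MB, counted here as 218,500,000 bytes for illustration.
const exampleBytes = 218.5 * 1_000_000;
console.log(`~${estimateTokens(exampleBytes).toLocaleString("en-US")} tokens`);
```

Even if the real ratio differs by a factor of two in either direction, a file of this size is far beyond what any context window can hold.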
Install
Install the plugin where OpenCode can resolve it:
cd ~/.config/opencode
bun add opencode-data-size-guardrail
Open:
~/.config/opencode/opencode.jsonc
Add:
{
  "$schema": "https://opencode.ai/config.json",
  "plugin": [
    "opencode-data-size-guardrail"
  ]
}
Restart:
opencode
That is it.
Use opencode.jsonc for your personal/global config. Use opencode.json for project/shared config.
Do not use bun add -g for OpenCode plugins unless you know your OpenCode install resolves global Bun packages. The reliable setup is installing the package in ~/.config/opencode.
The One Rule
Big raw files can live on disk.
They should not go into the LLM.
Bad:
raw data -> giant JSON -> LLM reads giant JSON -> token explosion
Good:
raw data -> local script -> small summary -> LLM
The script does the heavy lifting. The LLM reads the tiny result.
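As a rough illustration of the "local script -> small summary" step, here is a minimal sketch that reduces JSONL lines to counts plus a tiny sample. The names `summarizeJsonl` and `JsonlSummary` are hypothetical and not part of the plugin. A real script would stream lines from disk rather than take an array.

```typescript
// Hypothetical summary shape: small enough to hand to an LLM.
interface JsonlSummary {
  totalLines: number;
  parsedRecords: number;
  invalidLines: number;
  fieldCounts: Record<string, number>;
  sample: unknown[];
}

// Reduce many JSONL lines to counts and a few sample records.
function summarizeJsonl(lines: readonly string[], sampleSize = 3): JsonlSummary {
  const summary: JsonlSummary = {
    totalLines: 0,
    parsedRecords: 0,
    invalidLines: 0,
    fieldCounts: {},
    sample: [],
  };
  for (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    summary.totalLines++;
    let record: unknown;
    try {
      record = JSON.parse(line);
    } catch {
      summary.invalidLines++; // count, but never echo, malformed lines
      continue;
    }
    summary.parsedRecords++;
    // Count how often each top-level field appears on object records.
    if (record !== null && typeof record === "object" && !Array.isArray(record)) {
      for (const key of Object.keys(record)) {
        summary.fieldCounts[key] = (summary.fieldCounts[key] ?? 0) + 1;
      }
    }
    if (summary.sample.length < sampleSize) summary.sample.push(record);
  }
  return summary;
}

const demoLines = ['{"level":"error","msg":"timeout"}', '{"level":"info"}', "not json"];
console.log(JSON.stringify(summarizeJsonl(demoLines, 1), null, 2));
```

The output size depends on the number of distinct fields and the sample size, not on how many lines the input has.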
What It Does
Reads:
| Size | Behavior |
| ---: | --- |
| > 5 MB | warn |
| > 20 MB | soft-block |
| > 100 MB | hard-block |
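The read thresholds can be pictured as a simple tiered check. The sketch below uses the default byte values from the Config table. The function name `classifyRead` and the `"allow"` label for small files are assumptions made for illustration, not the plugin's actual code.

```typescript
type ReadDecision = "allow" | "warn" | "soft-block" | "hard-block";

// Default thresholds, taken from the Config table in this README.
const WARN_READ_BYTES = 5242880; // 5 MB
const ASK_READ_BYTES = 20971520; // 20 MB
const MAX_READ_BYTES = 104857600; // 100 MB

// Hypothetical classifier: check the strictest tier first.
function classifyRead(sizeBytes: number): ReadDecision {
  if (sizeBytes > MAX_READ_BYTES) return "hard-block";
  if (sizeBytes > ASK_READ_BYTES) return "soft-block";
  if (sizeBytes > WARN_READ_BYTES) return "warn";
  return "allow";
}

console.log(classifyRead(218_500_000)); // the 218.5 MB file from the intro
```

The comparisons are strict, matching the "> 5 MB" wording in the table, so a file of exactly 5242880 bytes passes without a warning in this sketch.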
Generated files:
| Size | Behavior |
| ---: | --- |
| > 20 MB | warn and record |
| > 100 MB | mark dangerous and block future reads |
Soft-block means: stop by default, but show the exact override command.
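The generated-file table describes a little more state than the read table: oversized outputs are remembered so that later reads can be refused. Here is a minimal sketch of that idea, using the default thresholds from the Config table. The in-memory `recorded` map, the `dangerousPaths` set, and both function names are hypothetical. This README does not describe how the plugin actually stores its records.

```typescript
type GeneratedDecision = "ok" | "warn-and-record" | "dangerous";

// Default thresholds, taken from the Config table in this README.
const WARN_GENERATED_BYTES = 20971520; // 20 MB
const MAX_GENERATED_BYTES = 104857600; // 100 MB

// Hypothetical in-memory state; the real plugin persists its records to disk.
const recorded = new Map<string, number>();
const dangerousPaths = new Set<string>();

// Called after a command has written a file of the given size.
function onFileGenerated(path: string, sizeBytes: number): GeneratedDecision {
  if (sizeBytes > MAX_GENERATED_BYTES) {
    recorded.set(path, sizeBytes);
    dangerousPaths.add(path); // future reads of this path get blocked
    return "dangerous";
  }
  if (sizeBytes > WARN_GENERATED_BYTES) {
    recorded.set(path, sizeBytes);
    return "warn-and-record";
  }
  return "ok";
}

// Called before a read: refuse files previously marked dangerous.
function isReadBlocked(path: string): boolean {
  return dangerousPaths.has(path);
}

console.log(onFileGenerated("token_flow_analysis.json", 218_500_000));
console.log(isReadBlocked("token_flow_analysis.json"));
```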
Real Example
This plugin came from a real OpenCode mistake.
An agent analyzed Marvin/MCP sessions, fetched full token values, and wrote this file:
token_flow_analysis.json 218.5 MB ~54,625,000 tokens
The problem was not running a script.
The problem was creating a giant raw file that the LLM might read next.
The safe version should have been:
MCP sessions -> local aggregation -> summary.md -> LLM
This plugin catches the same mistake for logs, JSON, JSONL, NDJSON, CSV, AWS exports, curl downloads, MCP data, custom scripts, and generated reports.
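To make the "local aggregation -> summary.md" step concrete, here is a hedged sketch that collapses per-session token records into a short Markdown report. The record shape `TokenFlowRecord` and the function `summarizeTokenFlows` are invented for this example. The real session data and the original script are not described in this README.

```typescript
// Hypothetical record shape, assumed for illustration only.
interface TokenFlowRecord {
  sessionId: string;
  tokens: number;
}

// Aggregate locally and emit a few lines of Markdown instead of raw values.
function summarizeTokenFlows(records: readonly TokenFlowRecord[], topN = 5): string {
  const totals = new Map<string, number>();
  for (const r of records) {
    totals.set(r.sessionId, (totals.get(r.sessionId) ?? 0) + r.tokens);
  }
  // Keep only the heaviest sessions so the report stays small.
  const ranked = Array.from(totals.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, topN);
  const grandTotal = Array.from(totals.values()).reduce((sum, n) => sum + n, 0);
  return [
    `Sessions: ${totals.size}`,
    `Total tokens: ${grandTotal}`,
    "",
    "| Session | Tokens |",
    "| --- | ---: |",
    ...ranked.map(([id, n]) => `| ${id} | ${n} |`),
  ].join("\n");
}

const demoRecords: TokenFlowRecord[] = [
  { sessionId: "a", tokens: 10 },
  { sessionId: "b", tokens: 5 },
  { sessionId: "a", tokens: 7 },
];
console.log(summarizeTokenFlows(demoRecords, 1));
```

The report length is bounded by `topN`, so it stays readable however many sessions were analyzed.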
What The Agent Sees
Blocked by opencode-data-size-guardrail.
This file is too large to read safely:
token_flow_analysis.json
Size: 218.5 MB
Estimated tokens: ~54,625,000
Use a safer workflow:
- sample the file
- aggregate locally
- extract only required fields
- generate a small summary JSON/Markdown file
For soft blocks, it also shows how to continue:
OPENCODE_DSG_ALLOW_LARGE_FILES=true opencode
or:
OPENCODE_DSG_MAX_READ_BYTES=209715200 opencode
Bad Commands
These are risky:
cat token_flow_analysis.json
cat *.jsonl
cat *.ndjson
cat app.log
jq . giant-file
grep ERROR huge.log
These are risky if they have no limit/sample/filter/summary:
node collect_token_flows.mjs
python collect_token_flows.py
node dump_sessions.mjs
python export_events.py
aws s3 cp s3://bucket/events.jsonl ./events.jsonl
aws s3 sync s3://bucket ./data
curl https://example.com/events.jsonl > events.jsonl
wget https://example.com/events.jsonl -O events.jsonl
mcp export everything
Good Commands
Do this instead:
cat token_flow_analysis.json | head -n 50
grep -m 20 ERROR app.log
grep --max-count=20 timeout app.log
jq '{count: length, sample: .[:20]}' data.json > summary.json
node scripts/summarize-events.js events.jsonl > summary.md
python scripts/profile_csv.py huge.csv > profile.md
node collect_token_flows.mjs --limit 10 --summary
Good files for agents:
summary.md
sample.json
stats.json
top-errors.md
schema-summary.md
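The difference between the risky and the safer read commands above mostly comes down to whether the output is bounded. The sketch below shows one way a heuristic could tell them apart. It is an illustration only: the regular expressions and the name `looksLikeUnboundedRead` are assumptions, not the plugin's actual rules. It covers only `cat`, `jq`, and `grep`, not export scripts or downloads.

```typescript
// Commands that print file contents straight to the terminal.
const READ_COMMAND = /^\s*(cat|jq|grep)\b/;

// Signs that the output is bounded or does not reach the LLM at all.
const LIMITERS: RegExp[] = [
  /\|\s*(head|tail)\b/, // piped into head or tail
  /\s-m\s*\d+/, // grep -m N
  /--max-count(=|\s+)\d+/, // grep --max-count=N
  />\s*\S+/, // stdout redirected to a file
];

// Hypothetical check: a read command with no limiter is treated as risky.
function looksLikeUnboundedRead(command: string): boolean {
  if (!READ_COMMAND.test(command)) return false;
  return !LIMITERS.some((re) => re.test(command));
}

for (const cmd of ["cat app.log", "grep -m 20 ERROR app.log", "jq . giant-file"]) {
  console.log(`${cmd} -> ${looksLikeUnboundedRead(cmd) ? "risky" : "bounded"}`);
}
```

A pattern-based check like this is easy to fool, for example by a `>` inside a quoted jq filter, so it can both miss risky commands and flag safe ones.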
Config
You probably do not need config.
| Variable | Default | Meaning |
| --- | ---: | --- |
| OPENCODE_DSG_WARN_READ_BYTES | 5242880 | warn over 5 MB |
| OPENCODE_DSG_ASK_READ_BYTES | 20971520 | soft-block over 20 MB |
| OPENCODE_DSG_MAX_READ_BYTES | 104857600 | hard-block over 100 MB |
| OPENCODE_DSG_WARN_GENERATED_BYTES | 20971520 | warn/record generated files over 20 MB |
| OPENCODE_DSG_MAX_GENERATED_BYTES | 104857600 | dangerous generated files over 100 MB |
| OPENCODE_DSG_ALLOW_LARGE_FILES | false | when true, warn only and never block |
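Values for the byte-sized variables appear in this README both as plain byte counts (209715200) and with a unit suffix (200mb). The sketch below shows how such values could be normalized to bytes, assuming binary multiples (1 mb = 1024 * 1024 bytes), which matches the defaults in the table. The function `parseByteSize` and the set of accepted suffixes are assumptions. The plugin's real parsing rules are not documented here.

```typescript
// Assumed unit multipliers (binary), chosen to match defaults like 5242880 = 5 * 1024 * 1024.
const UNIT_MULTIPLIERS: Record<string, number> = {
  b: 1,
  kb: 1024,
  mb: 1024 ** 2,
  gb: 1024 ** 3,
};

// Hypothetical parser: accepts "209715200", "200mb", "1.5gb"; falls back on anything else.
function parseByteSize(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const match = /^(\d+(?:\.\d+)?)\s*(b|kb|mb|gb)?$/i.exec(raw.trim());
  if (!match) return fallback;
  const value = Number(match[1]);
  const unit = (match[2] ?? "b").toLowerCase();
  return Math.round(value * (UNIT_MULTIPLIERS[unit] ?? 1));
}

console.log(parseByteSize("200mb", 104857600));
console.log(parseByteSize(undefined, 104857600));
```

In a real plugin the first argument would come from the environment, for example the value of OPENCODE_DSG_MAX_READ_BYTES, with the table default as the fallback.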
Examples:
OPENCODE_DSG_MAX_READ_BYTES=200mb opencode
OPENCODE_DSG_ALLOW_LARGE_FILES=true opencode
Generated large files are recorded in:
.opencode-data-size-guardrail.json
Limits
- It is heuristic, not a full shell parser.
- Weird commands can slip through.
- Safe commands can sometimes be blocked.
- No dashboard. No database. No external service.
The goal is simple: stop obvious token disasters.
Development
bun install
bun test
bun run build
bun run typecheck
Publish
Before publishing, verify and build:
bun test
bun run typecheck
bun run build
npm publish --access public
After publishing, reinstall the npm package for OpenCode:
cd ~/.config/opencode
bun remove opencode-data-size-guardrail
bun add opencode-data-size-guardrail@latest
Then restart OpenCode so it reloads the plugin.
