json-shredder (v0.1.0)

Reduce token tax in AI development by shrinking JSON payloads safely.
json-shredder preprocesses JSON for LLM prompts by compacting structure, trimming low-value content, and redacting sensitive fields before data ever reaches a model.
Why this package exists
Teams often send raw logs, API responses, and nested objects directly to LLMs. That increases:
- Token cost
- Latency
- Context window pressure
- Risk of leaking secrets
json-shredder addresses all four by default.
Research basis
Design decisions for this package were informed by:
- OpenAI token guidance: token usage drives cost and context limits; a rough rule of thumb is that one token is approximately 4 characters.
- OpenAI token-counting cookbook: practical token estimation and message-overhead awareness.
- Anthropic prompt engineering docs: establish measurable success criteria and iteratively optimize prompts.
- OWASP logging guidance: never log secrets or access tokens directly; prefer masking/redaction.
Install
```bash
npm install json-shredder
```

Quick start

```js
const { shredJson } = require("json-shredder");

const payload = {
  event: "checkout",
  authorization: "Bearer sensitive",
  user: {
    id: "u_42",
    bio: "A very long profile string ...",
  },
  logs: ["...many lines..."],
};

const result = shredJson(payload, {
  maxStringLength: 140,
  maxArrayItems: 20,
  dropPaths: ["logs"],
  keepPaths: ["event", "user.id"],
});

console.log(result.json);
console.log(result.report);
```

CLI usage
Shred a file:
```bash
json-shredder shred --in payload.json --out payload.shred.json --report
```

Shred from stdin:

```bash
cat payload.json | json-shredder shred --report
```

Estimate token savings:

```bash
json-shredder estimate --in payload.json
```

What gets optimized
- Key sorting for stable, compact JSON output.
- String truncation with explicit omission markers.
- Array trimming with omitted item summary marker.
- Depth limiting to prevent oversized nested payloads.
- Key/path filtering to keep only prompt-relevant fields.
- Secret redaction by default key set plus custom keys/paths.
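To make the truncation and trimming behaviors above concrete, here is a minimal sketch of the idea (not the package's actual implementation; the function names and marker text are illustrative assumptions):

```js
// Sketch: truncate a long string, leaving an explicit omission marker.
// The exact marker format json-shredder emits may differ.
function truncateString(s, maxLen) {
  if (s.length <= maxLen) return s;
  const omitted = s.length - maxLen;
  return s.slice(0, maxLen) + ` …[${omitted} chars omitted]`;
}

// Sketch: cap an array and append a summary marker for the dropped items.
function trimArray(arr, maxItems) {
  if (arr.length <= maxItems) return arr;
  const kept = arr.slice(0, maxItems);
  kept.push(`…[${arr.length - maxItems} items omitted]`);
  return kept;
}
```

The key design point is that omissions stay visible to the model: a marker like `…[6 chars omitted]` tells the LLM that content was removed, rather than silently handing it a truncated value.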
Default security behavior
The following key names are redacted by default (case-insensitive):
- access_token
- api_key
- apikey
- authorization
- cookie
- id_token
- password
- refresh_token
- secret
- set-cookie
- token
The replacement value is `[REDACTED]` by default.
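The default behavior described above amounts to a recursive walk that replaces any value whose key matches the deny list, case-insensitively. A minimal sketch of that idea (not the package's source; the function name is an assumption, the key set is the one listed above):

```js
// Sketch of case-insensitive, key-based secret redaction.
const DEFAULT_REDACT_KEYS = new Set([
  "access_token", "api_key", "apikey", "authorization", "cookie",
  "id_token", "password", "refresh_token", "secret", "set-cookie", "token",
]);

function redact(value, replacement = "[REDACTED]") {
  if (Array.isArray(value)) return value.map((v) => redact(v, replacement));
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, v] of Object.entries(value)) {
      // Match on the lowercased key so "Token" and "TOKEN" are caught too.
      out[key] = DEFAULT_REDACT_KEYS.has(key.toLowerCase())
        ? replacement
        : redact(v, replacement);
    }
    return out;
  }
  return value;
}
```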
API
shredJson(input, options)
Returns:
- data: shredded JSON object
- json: serialized JSON string
- stats: before/after and operation counters
- report: human-readable summary
Options:
- maxArrayItems (default 24)
- maxDepth (default 8)
- maxStringLength (default 240)
- keepKeys (array)
- removeKeys (array)
- redactKeys (array)
- keepPaths (array of dot paths)
- dropPaths (array of dot paths)
- redactPaths (array of dot paths)
- removeEmptyStrings (boolean)
- removeNulls (boolean)
- dropEmptyObjects (default true)
- appendArraySummary (default true)
- sortKeys (default true)
- pretty (boolean)
- redactReplacement (string)
Path wildcard support:
- `*` matches any single segment.
- Example: `events.*.raw`
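Segment-wise wildcard matching like this can be sketched as follows (an illustration under the semantics stated above, not the package's internal matcher):

```js
// Sketch: match a dot path against a pattern where "*" matches
// exactly one segment (so "events.*.raw" matches "events.0.raw"
// but not "events.raw" or "events.0.raw.extra").
function pathMatches(pattern, path) {
  const pat = pattern.split(".");
  const segs = path.split(".");
  if (pat.length !== segs.length) return false;
  return pat.every((p, i) => p === "*" || p === segs[i]);
}
```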
shredToString(input, options)
A shortcut that returns only the shredded JSON string.
estimateTokens(input)
Returns a character-based estimate using `ceil(chars / 4)`.
Use model-specific tokenizers for strict accounting in production billing systems.
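A character-based estimate like this is a one-liner (a sketch; how the real `estimateTokens` serializes non-string input is an assumption):

```js
// Rough token estimate: ~4 characters per token, applied to the
// serialized JSON. This is a heuristic, not a real tokenizer.
function estimateTokens(input) {
  const text = typeof input === "string" ? input : JSON.stringify(input);
  return Math.ceil(text.length / 4);
}
```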
Token-reduction workflow recommendation
- Keep only fields needed for the model task.
- Drop high-volume debug/raw data by path.
- Redact secret and regulated fields.
- Truncate long strings and cap arrays.
- Track before/after estimate in CI to prevent prompt bloat regressions.
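The last step can be wired into CI as a simple threshold check, for example failing the build when a shredded payload stops shrinking as much as expected (a sketch; the function name and the 50% budget are assumptions you would tune per project):

```js
// Sketch of a CI guard against prompt bloat: fail when the shredded
// payload exceeds a budgeted fraction of the original size.
function checkPromptBloat(beforeChars, afterChars, maxRatio = 0.5) {
  const ratio = afterChars / beforeChars;
  if (ratio > maxRatio) {
    throw new Error(
      `Shredded payload is ${(ratio * 100).toFixed(0)}% of the original; ` +
        `budget is ${maxRatio * 100}%`
    );
  }
  return ratio;
}
```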
Security notes
- Redaction is best-effort and key/path based; validate against your own data classification rules.
- Avoid passing untrusted shredded output into code-execution contexts.
- Keep logs and prompt artifacts access-controlled.
- Review SECURITY.md for threat model and disclosure policy.
Development
Run tests:
```bash
npm test
```

Run audit:

```bash
npm run audit
```

Check publish contents:

```bash
npm run pack:check
```

Publish:

```bash
npm publish --access public
```

License
MIT
