logstrip
v1.10.0
Published
Stream-based CLI that trims noisy server logs, build pipelines, vulnerability scanners, and container workloads down to the diagnostic context LLM agents actually need.
Downloads
1,744
Maintainers
Readme
LogStrip
compress noisy logs before they poison your LLM context
A zero-dependency Node.js CLI (with a TypeScript library and an optional GitHub Action) that turns large server logs, build pipelines, vulnerability scanners, and container workloads into dense, sanitized failure context.
Install · Quick Start · Benchmarks · vs Alternatives · Agents · How It Works · Custom Config · CLI · Library · GitHub Action · Docs
You paste a 50k-line CI log into your agent. It chews 200k+ tokens on noise - health checks, framework internals, repeated stack frames, UUIDs - and still misses the one [ERROR] line that matters. LogStrip trims that to the diagnostic context an LLM actually needs. One command. Zero dependencies. Streaming - never loads the full log into memory.
What changes: Session 1 your build fails with a flaky test. You feed the log through logstrip. Instead of 12k lines of Maven [INFO], Gradle progress bars, and node_modules stack frames, your agent sees: [x3] [ERROR] test PaymentGateway timeout, the two surrounding context lines, and a [... hidden internal library frames ...] marker. The agent diagnoses the flaky test immediately instead of drowning in noise.
Install
Requires Node.js 20 or newer.
npm install --global logstripOr run without installing:
npx -y logstrip raw.log -o clean.logQuick Start
# File in, file out
logstrip raw.log -o clean.log
# Unix pipe (stdin → stdout)
cat raw.log | logstrip > clean.log
# File in, stats to stderr, compressed log to stdout
logstrip raw.log --stats > clean.logQuick example
Raw input:
[INFO] boot ok
[ERROR] request 123e4567-e89b-12d3-a456-426614174000 failed
[ERROR] request 987e6543-e21b-42d3-b456-526614174111 failed
at lib (/repo/node_modules/pkg/index.js:1:1)
LogStrip output:
[x2] [ERROR] request [ID] failed
[... hidden internal library frames ...]
Compression ratios from the 41 fixture test suite across real-world log sources:
| Category | Sample fixture | Input lines | Output lines | Token savings |
|:---|:---|---:|---:|---:|
| CI platforms | ci-platforms.log | 57 | 18 | 68% |
| Build toolchains | build-toolchains.log | 70 | 23 | 67% |
| Security scanners | security-scanners.log | 88 | 25 | 72% |
| Infra / Cloud | terraform-ansible-systemd.log | 41 | 14 | 66% |
| Serverless | cloud-serverless.log | 32 | 11 | 66% |
| Java enterprise | java-enterprise.log | 92 | 28 | 70% |
| AI/ML ecosystem | ai-ml-ecosystem.log | 85 | 22 | 74% |
| Web frameworks | web-frameworks-node.log | 68 | 20 | 71% |
| Docker build | docker-build.log | 54 | 16 | 70% |
| Kubernetes | kubernetes-crashloop.log | 47 | 12 | 74% |
Production logs with millions of lines routinely hit 80%+ token savings because noise ratios scale with log size.
Full fixture catalogue:
tests/fixtures/- 41.logfiles covering 705+ ecosystem signatures. Each fixture has a committed snapshot baseline.
| | LogStrip | grep -v / awk | LLM summarization | logreduce |
|:---|:---|:---|:---|:---|
| Type | Streaming log compressor | Line filter | API call + prompt | ML-based anomaly detector |
| Token savings | 80%+ (typical CI logs) | 20-40% (fragile patterns) | 60-80% (expensive, lossy) | ~50% (anomaly-only) |
| Streaming | Yes (readline, bounded memory) | Yes (pipe) | No (buffer entire log) | No (batch) |
| Deduplication | Smart [xN] folding with delta values | No | Approximate | No |
| Sanitization | UUIDs, IPs, timestamps, AWS keys, JWTs, GitHub tokens, Slack tokens, connection strings, Authorization: headers | Manual regex | Unreliable | Partial |
| Stacktrace collapse | Internal node_modules → single marker | No | Often drops context | No |
| Runtime deps | 0 (node:* built-ins only) | 0 | Heavy (API + tokens) | Python + ML stack |
| LLM cost | $0 (pure computation) | $0 | $0.01-$1.00+ per log | $0 (compute only) |
| Extensible | .logstrip.yml custom config | Shell scripts | Prompt engineering | Plugin system |
| CI integration | CLI + GitHub Action | Shell scripts | API wrapper | CLI |
LogStrip ships agent plugin bundles so assistants compress logs before
diagnosing them. The workflow is the same everywhere: run logstrip, analyze the
.logstrip.log, then report token savings from --stats.
| | | | | | |
|:---:|:---:|:---:|:---:|:---:|:---:|
|
Claude Codehooks + agents + skill |
Codex CLIhooks + skill + AGENTS.md |
Factory Droiddroids + skill + command |
Cursorrules + hooks |
GitHub Copilotmarketplace plugin |
OpenCodeskill + /logstrip command |
| AiderCLI pipe |
ClineCLI pipe |
Gemini CLICLI pipe |
WindsurfCLI pipe |
Roo CodeCLI pipe | Any agentCLI / stdin pipe |
Works with any agent that can run a shell command or read a file. One binary, compressed output shared across all of them.
How the agent plugin works
The plugin bundles (plugins/logstrip/) ship hooks, skills, and agents for
each supported platform. The core mechanism is the same everywhere:
1. PreToolUse hook - auto-compress on Read
When the agent is about to read a .log, .out, .txt, .trace, or .err
file, the hook intercepts the Read call, runs logstrip <file> -o
<file>.logstrip.log, and denies the raw read - instead redirecting the
agent to the compressed .logstrip.log file. Raw bytes never enter the
context window. Already-compressed files (.logstrip.log) and non-log
extensions (.ts, .py, .json, etc.) are skipped automatically.
2. UserPromptSubmit hook - detect pasted logs
When the user pastes log-like output into the chat (timestamps, log levels,
stack traces, CI markers - needs 2+ heuristic matches across 5+ lines), the
hook injects additionalContext telling the agent to write the paste to a temp
file and run logstrip before analysing, rather than reading the raw paste
line-by-line.
3. Skills, commands, and agents
| Component | Platform(s) | Purpose |
|:---|:---|:---|
| /logstrip skill | Claude, Codex, Droid, OpenCode | Invoked when the user asks to compress, trim, or prepare a log for analysis. |
| /logstrip command | Droid, OpenCode | Slash command: logstrip <input> [--output ...] [--aggressiveness ...]. |
| logstrip-reviewer agent / droid | Claude, Droid | Reviews diffs against LogStrip coding standards and the 100% coverage gate. |
| logstrip-fixture-author agent / droid | Claude, Droid | Generates realistic CI log fixtures and wires them into smoke tests. |
| logstrip.mdc rule | Cursor | Activates on **/*.log globs. |
| logstrip-paste-detect.mdc rule | Cursor | Always-on rule that detects pasted log output. |
| copilot-instructions.md | Copilot | Top-level custom instructions for log-aware behaviour. |
| logstrip.instructions.md | Copilot | File-scoped instructions (applyTo: **/*.log). |
| logstrip.prompt.md | Copilot | Agent-mode prompt for compress-and-diagnose workflow. |
| AGENTS.md | Codex, OpenCode | Project-level agent instructions for log handling. |
4. Per-agent plugin manifests
| Agent | Manifest | Components |
|:---|:---|:---|
| Claude Code | plugins/logstrip/.claude-plugin/plugin.json | hooks.json (PreToolUse + UserPromptSubmit), agents, commands, skill |
| Factory Droid | plugins/logstrip/.factory-plugin/plugin.json | Droids, skill, command |
| Codex CLI | plugins/logstrip/.codex-plugin/plugin.json | hooks.json (PreToolUse + UserPromptSubmit), skill, AGENTS.md |
| Cursor | plugins/logstrip/.cursor-plugin/plugin.json | cursor-hooks.json (PreToolUse + UserPromptSubmit), rules (logstrip + paste-detect) |
| GitHub Copilot | plugins/logstrip/.github/plugin.json | hooks.json (PreToolUse + UserPromptSubmit), agents, commands, skill, copilot-instructions.md, instructions/, prompts/ |
| OpenCode | plugins/logstrip/opencode/.opencode/ | AGENTS.md, skill, /logstrip command |
See the Agent Plugin Installation guide for per-agent setup.
Installing the Copilot marketplace plugin
LogStrip publishes a Copilot agent plugin that bundles hooks, skills, agents,
instructions, and prompts into a single installable package. It works in both
VS Code and the GitHub Copilot CLI (gh copilot).
VS Code - add the LogStrip marketplace to your settings:
{
"chat.plugins.enabled": true,
"chat.plugins.marketplaces": ["mrwogu/logstrip"]
}Then browse plugins with @agentPlugins in the Extensions view, or VS Code
will discover LogStrip automatically on next startup.
Copilot CLI - install directly:
gh copilot plugin install mrwogu/logstrip:plugins/logstripOnce installed, Copilot will auto-compress .log files on read and detect
pasted log output - the same PreToolUse and UserPromptSubmit hooks that work
in Claude Code and Cursor.
LogStrip detects 705+ log ecosystems (matched in a single pass with an Aho-Corasick automaton) and applies several cuts to every streamed line:
| Cut | What it does |
|:---|:---|
| Defoliation | Drops [INFO], [DEBUG], [TRACE], and [VERBOSE] lines. |
| Sanitization | Replaces UUIDs, timestamps, IPs, AWS keys, GitHub tokens, JWTs, Slack tokens, connection string passwords, Authorization: headers, and long hashes with compact placeholders. Groups HTTP status codes (503 → [5xx]). |
| Context scoring | Keeps high-signal diagnostics and nearby context while dampening repeated spam (TF-IDF). |
| Smart dedup | Folds repeated sanitized lines, including same-shape variants, into [xN] message with only differing values listed. |
| Stack collapse | Replaces internal node_modules/ and runtime frames with one marker line. |
| Stack-window collapse (auto) | Folds repeated multi-line stack traces that differ only in addresses, Go offsets, or goroutine ids into a single [xN] group. |
| Root-cause pruning (auto) | Drops downstream cascade restatements (aborting due to previous errors, could not compile … due to previous error, skipped because the upstream job failed) so the originating failure stands out. |
| Multilingual detection (auto) | Recognizes error/failure/exception keywords in 8+ languages plus CJK (erreur, Fehler, fallo, ошибка, 错误, …). |
| Format voting (auto) | Locks the fast path onto the first recognizable line, then self-corrects the detected format with a majority vote over the first 50 non-blank lines. |
| Instance-counter folding | Folds enumerated counters (worker [1 \| 2 \| 3], retry …) into the repeat signature; labels whose numbers carry meaning (error/code/status/exit) are excluded. |
| Multiline joining | Joins indented continuation lines (Python tracebacks, Node stack frames, Java Caused by: chains, Go goroutine dumps) with their parent into a single logical line. |
| Severity filtering | Drops lines below a configurable minimum severity (fatal / error / warn / info / debug / trace). |
| CI noise filters | Drops progress bars, timestamp-only lines, K8s Normal events, and rate-limited repetition messages. |
Cuts marked (auto) are enabled by default in auto mode; disable any of them
with the matching --no-* flag.
[INFO] boot ok ← dropped (noise tag)
[ERROR] request 123e4567-...-426614174000 failed ← kept + sanitized
[ERROR] request 987e6543-...-526614174111 failed ← kept + sanitized → folded
[ERROR] charge failed id=018f23ab-... amount=99.99 ← kept + sanitized → dedup group
[ERROR] charge failed id=018f23ab-... amount=49.50 ← kept + sanitized → dedup group
[ERROR] charge failed id=018f23ab-... amount=12.00 ← kept + sanitized → dedup group
at lib (/repo/node_modules/pkg/index.js:1:1) ← internal stack → markerbecomes:
[x2] [ERROR] request [ID] failed
[x3] [ERROR] charge failed id=[ID] amount=[99.99 | 49.50 | 12.00]
[... hidden internal library frames ...]See the full source catalogue for all 705+ detected ecosystems.
Corporations and teams running internal tools can extend LogStrip
without modifying source code. Create a .logstrip.yml file (or pass
--config path/to/config.yml) to define custom log sources, diagnostic
patterns, ignore rules, sanitization rules, and internal stack patterns
that merge with the built-in set at runtime.
# .logstrip.yml - Acme Corp CI extension
sources:
- name: acme-ci
markers: [acme-ci-runner, "[ACME-CI]"]
diagnosticPatterns:
- "ACME_BUILD_FAILED"
- "ACME_TEST_TIMEOUT"
ignorePatterns:
- "\\bacme-ci heartbeat\\b"
sanitizePatterns:
- pattern: "ACME-EMP-\\d{6}"
replacement: "[ACME-EMP]"
- pattern: "acme-tenant/[a-z0-9-]+"
replacement: "acme-tenant/[ID]"
flags: "gi"
internalStackPatterns:
- "/opt/acme/ci-runner/"How it works:
- Auto-detection - When
--configis not provided, the CLI looks for.logstrip.ymlin the current working directory. - Merging - Custom sources with a name that already exists in the
built-in set (e.g.
docker) have their markers merged. New names are appended. - Order of application - Custom ignore patterns are checked before built-in noise-tag filtering. Custom sanitize rules run after built-in sanitization. Custom diagnostic patterns add +50 to the relevance score. Custom internal-stack patterns are checked alongside built-in ones.
- Zero new runtime dependencies - The YAML subset parser is
built into
logstrip-config.tsand handles mappings, sequences, inline arrays, quoted and unquoted strings, and comments. It does not requirejs-yamlor any external package.
Then simply run:
logstrip ci-output.log -o clean.log # .logstrip.yml auto-detected
logstrip ci-output.log -o clean.log --config /etc/logstrip/acme.yml # explicitFull config reference: CLI docs - Custom configuration
CLI
LogStrip is primarily a CLI tool. The logstrip binary is the sole
entry point - install globally and call it directly.
Usage: logstrip [INPUT] [options]
Arguments:
INPUT Path to the raw log. When omitted, reads from stdin.
Options:
-o, --output <path> Write the compressed log to <path>. Defaults to stdout.
-a, --aggressiveness <l> Compression preset: low | medium | high | aggressive | auto (default: auto).
-m, --multiline <mode> Join multiline logs: auto | python | node | java | go | rust | off (default: off).
--severity <level> Minimum severity: fatal | error | warn | info | debug | trace.
--include <regex> Keep only lines matching this regex.
--exclude <regex> Drop lines matching this regex.
--sample <N> Limit output to first N kept lines.
--max-tokens <N> Trim output to at most N tokens, keeping the highest-scoring lines (LLM context-budget mode).
--dedupe-window <N> Collapse non-adjacent duplicate lines seen within the last N distinct lines. Default: 1 (adjacent only).
--format-sample <N> Majority-vote format detection window over the first N non-blank lines. Default: 50.
--collapse-blocks <N> Collapse consecutive repeats of a block of up to N lines into one copy plus a [block xM] marker.
--no-collapse-stacks Disable auto-collapsing of repeated stack-trace windows that differ only in addresses/offsets (auto on).
--no-root-cause Disable auto-pruning of downstream cascade restatements (auto on).
--no-multilingual Disable auto-detection of non-English error/failure keywords (auto on).
--no-adaptive-context Disable auto-mode adaptive context windows around errors (auto on).
--preserve-id-suffix <N> Keep the last N chars of redacted UUIDs/hashes (0-16, default 0).
--max-line-length <n> Truncate lines longer than n chars. Default: 100000.
--timeout <s> Stop processing after s seconds.
--progress Show progress bar (file input only, requires --output).
--config <path> Path to .logstrip.yml config file. Auto-detects from cwd.
--telemetry Show cumulative telemetry summary on stderr and exit.
-s, --stats Print compression statistics to stderr.
-j, --json Print LogStripResult as JSON to stdout. Requires --output.
-h, --help Show help text and exit.
-v, --version Print the CLI version and exit.In the default auto mode the detection and compression boosters
(--collapse-stacks, --root-cause, --multilingual, adaptive context
windows, and majority-vote format detection) are on automatically - the
--no-* flags above are opt-outs for when you want the raw, unboosted pass.
Recipes
# 1. File in, file out
logstrip raw.log -o clean.log
# 2. Pipe stdin to stdout (Unix-style)
cat raw.log | logstrip > clean.log
# 3. File in, stats to stderr while content streams to stdout
logstrip raw.log --stats > clean.log
# 4. Programmatic report - compressed log to file, JSON report to stdout
logstrip raw.log -o clean.log --json
# 5. Custom config for internal tools
logstrip raw.log -o clean.log --config .logstrip.yml
# 6. Join Python tracebacks into logical lines
logstrip traceback.log -m python -o clean.log
# 7. Keep only error+fatal lines
logstrip raw.log --severity error -o clean.log
# 8. Suppress download noise in build logs
logstrip build.log --exclude 'Downloading|Extracting' -o clean.log
# 9. Preview first 50 significant lines of a huge log
logstrip huge.log --sample 50 -o preview.log
# 10. CI time budget - stop after 30 seconds
logstrip raw.log --timeout 30 -o clean.logPowerShell equivalents:
Get-Content raw.log | logstrip > clean.log
logstrip raw.log --stats -o clean.log; Get-Content clean.log -Tail 20Exit codes
| Code | Meaning |
|:--:|:---|
| 0 | success |
| 1 | runtime failure (I/O error, stream error) |
| 2 | usage error (bad flag, unsupported aggressiveness, --json without --output, stdin is a TTY) |
Library
The compiled package also ships the TypeScript core for embedding directly in your own Node tooling.
import { processLogFile, type LogStripResult } from 'logstrip';
const result: LogStripResult = await processLogFile('raw.log', 'clean.log', {
aggressiveness: 'auto',
multiline: 'python',
severity: 'error',
});
console.log(`saved ${result.savedTokens} tokens (${result.savingsPercent}%)`);processLogStream(input, output, options) is also exported for non-file streams
(stdin, network sockets, custom transforms). LogStripOptions groups into:
- Filtering & limits —
include,exclude,sampleSize,maxLineLength,maxTokens - Dedupe & folding —
dedupe,dedupeWindow,collapseBlocks - Context & sanitization —
contextBefore,contextAfter,preserveIdSuffix - Auto-mode boosters (default-on; set
falseto disable) —collapseRepeatedStacks,rootCause,multilingual,adaptiveContext - Detection & output —
formatDetectionSampleSize,outputFormat - Custom config —
config(inline) orconfigPath(path to.logstrip.yml) - Programmatic hooks —
signal(AbortSignal),onDecision(per-line decision callback),tokenEstimator(custom token counter)
See the core reference for the full option table.
For time-bounded processing use processLogStreamWithTimeout, which sets
result.timedOut = true when the deadline is reached.
GitHub Action
The repository also ships an optional GitHub Action that wraps the same parser. It is useful when you want a single step in CI and a tidy Step Summary, but the CLI is the primary distribution channel.
Dogfooding: This project uses its own action in its CI pipeline to compress fixture logs and render token savings in every workflow run.
- name: Compress logs with LogStrip
uses: mrwogu/logstrip@v1
id: logstrip
with:
log-path: raw_logs.txt
aggressiveness: auto
- name: Send compact logs to your AI agent
run: your-agent analyze --file "${{ steps.logstrip.outputs.output-path }}"See the GitHub Action reference for inputs, outputs, and the Step Summary contract.
Documentation
| Resource | Description |
|:---|:---|
| Getting Started | Install the CLI and trim your first log. |
| CLI Reference | Flags, exit codes, recipes, and --config docs. |
| Core API | TypeScript parser API for library use. |
| GitHub Action | Optional CI wrapper around the CLI core. |
| Agent Plugins | Claude Code, Droid, Copilot, Cursor, Codex, and OpenCode bundles. |
| Source Catalogue | All 705+ detected log ecosystem signatures. |
| Security | Sanitization and safe log handling notes. |
Local Development
npm install
npm run typecheck
npm run test:coverage # 100/100/100/100 gate
npm run build