@sresarehumantoo/reaper
v0.1.1
Dead code and obfuscation analyzer for JavaScript and TypeScript, with a hardened Docker sandbox for dynamic analysis.
reaper
Dead-code and obfuscation analyzer for JavaScript and TypeScript, with an optional hardened Docker sandbox for dynamic behavioral analysis of suspicious scripts.
Originally built to triage JS malware samples - packed payloads, eval layers, char-code arrays, base64 staging, smart-contract-hosted payloads - but it works just as well as a plain dead-code finder on regular source trees.
[!WARNING]
examples/ contains real, live malware samples. The files under examples/dom01/, examples/etherhiding/artifacts/, examples/deadcode01/, and examples/deadcode02/ are deobfuscation fixtures and reverse-engineering walkthroughs that ship inert payloads (.js, .b64, .hex, .txt) as data files. They will not execute unless you deliberately run them. Do not run any file under examples/ with node, bash, or eval. Do not paste the contents of any clipboard-payload.txt into a shell.
If you want a code-only clone with no payloads, use a sparse checkout:
git clone --filter=blob:none --no-checkout https://github.com/sresarehumantoo/reaper.git
cd reaper
git sparse-checkout init --cone
git sparse-checkout set src scripts docker
git checkout
See examples/etherhiding/README.md for the analysis walkthrough and examples/etherhiding/REPORT.md for the full report.
What it does
Static analysis (Babel-based AST):
- Unused imports, variables, functions, and exports
- Unreachable code after
return/throw - Dead branches via constant folding (
if (false),1 === 2, etc.) - Obfuscation patterns:
eval,new Function,setTimeout("..."),atob,String.fromCharCode(...), bracket access to['eval']/['constructor'], high-entropy string literals, hex/unicode escape density - Cross-scope reachability: call-graph BFS from auto-detected or user-supplied entry points
- Eval-aware scope capture - intercepts
eval'd source and recursively analyses the inner layers p,a,c,k,e,rstatic unpack + string folding inside dead function bodies (recovers constant strings from code that won't run)- obfuscator.io string-array rewriter - detects the array-fn + decoder + IIFE-shuffle + wrapper-fn pattern (including nested wrappers), boots the decoder in a vm, inlines enclosing-scope const lookups, and substitutes every wrapper call with its plaintext string. Output is a fully rewritten
.deobf.js - XOR-loop decoder recovery - detects functions of the form
for (i) out += fromCharCode(s.charCodeAt(i) ^ k.charCodeAt(i % k.length))and, when callers pass string-literal arguments, statically recovers the plaintext into the finding - AAEncode/JJEncode detection - flags the katakana-heavy ASCII-art encoding family. Recovery requires execution; route through
scripts/analyze.shor--reachability - IOC extraction (
--iocs) - pulls URLs, bare domains, IPv4, EVM addresses, EVM function selectors, base64 blobs, high-entropy strings, and email addresses out of any input, with context hints (prop:data,arg-of:fetch,init:varName) so analysts see how each indicator is wired up
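To make the XOR-loop item above concrete, here is a minimal sketch of the decoder shape it describes. It is not reaper's implementation: the function name xorDecode, the key, and the sample string are all hypothetical. Because XOR with a repeating key is its own inverse, a decoder like this can be evaluated at analysis time whenever both arguments are string literals, which is what makes static recovery possible.

```typescript
// Hypothetical decoder in the shape the analyzer looks for:
// each character is XORed with the key character at the same (wrapped) index.
function xorDecode(s: string, k: string): string {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    out += String.fromCharCode(s.charCodeAt(i) ^ k.charCodeAt(i % k.length));
  }
  return out;
}

// XOR is symmetric, so the same function builds a test ciphertext.
// Key and plaintext are made up for illustration.
const xorKey = "k3y";
const xorCipher = xorDecode("https://example.invalid/", xorKey);

// With both arguments known as literals, the plaintext is recoverable
// without running the surrounding script.
const xorRecovered = xorDecode(xorCipher, xorKey);
```

A real analyzer would have to find such calls in the AST and resolve escape sequences in the literal arguments first; this sketch only shows the arithmetic.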
HTML / data-URI ingestion:
- .html inputs are scanned for inline <script> blocks and data:text/javascript;base64,... URIs; each script becomes a virtual sub-file the analyzers process independently
- Common in real-world DOM dumps where the malicious payload is smuggled as a base64 data URI in a <script src=...>
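The ingestion step above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code in src/parser/html.ts: it uses a regular expression instead of a real HTML parser, the SubFile shape and file-naming scheme are invented, and the small base64 decoder only handles ASCII/Latin-1 payloads.

```typescript
// Hypothetical shape for a virtual sub-file handed to the analyzers.
interface SubFile {
  name: string;
  source: string;
}

const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Minimal base64 decoder so the sketch has no runtime dependencies.
function decodeBase64(input: string): string {
  let bits = 0;
  let bitCount = 0;
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const v = B64_ALPHABET.indexOf(input.charAt(i));
    if (v < 0) continue; // skips padding, whitespace, and unknown characters
    bits = ((bits << 6) | v) & 0xffff;
    bitCount += 6;
    if (bitCount >= 8) {
      bitCount -= 8;
      out += String.fromCharCode((bits >> bitCount) & 0xff);
    }
  }
  return out;
}

// Collect inline <script> bodies and base64 data: URIs as separate sub-files.
function extractScripts(html: string): SubFile[] {
  const subFiles: SubFile[] = [];
  const tag = /<script\b([^>]*)>([\s\S]*?)<\/script>/gi;
  const dataUri = /src\s*=\s*["']data:text\/javascript;base64,([^"']+)["']/i;
  let m: RegExpExecArray | null;
  while ((m = tag.exec(html)) !== null) {
    const uri = dataUri.exec(m[1]);
    if (uri) {
      subFiles.push({
        name: `script-${subFiles.length}.datauri.js`,
        source: decodeBase64(uri[1]),
      });
    } else if (m[2].trim().length > 0) {
      subFiles.push({ name: `script-${subFiles.length}.inline.js`, source: m[2] });
    }
  }
  return subFiles;
}

const samplePage =
  "<html><script>var a = 1;</script>" +
  '<script src="data:text/javascript;base64,YWxlcnQoMSk="></script></html>';
const extracted = extractScripts(samplePage);
```

Treating each extracted script as its own sub-file means a finding can point at the specific script block rather than at the whole page.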
Dynamic analysis (Docker sandbox):
- node:20-alpine container, non-root uid 1001, all caps dropped, no-new-privileges
- --network none, 256 MB memory cap, 0.5 CPU, read-only FS, noexec tmpfs
- Pre-loaded monitoring shim logs eval/new Function/setTimeout(string) calls, require()s, env-var access, fetch/net/http/fs calls as [REAPER] JSON lines on stderr
- Hard wall-clock timeout, child_process/cluster/worker_threads blocked
- Runtime modes:
  - --observe-network - real egress stays blocked but a stub fetch/http responder is installed so the script proceeds past network calls; you see the URL/method/body it would have used
  - --block-eval - eval/Function throw after logging
  - --block-fs - file-system writes throw after logging
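Since the shim reports events as [REAPER] JSON lines on stderr, a downstream consumer might filter them along these lines. This is a sketch: the category/detail shape follows the sample line shown in the Examples section of this README, but the ReaperEvent interface and the parseReaperLines helper are hypothetical, and the full set of categories and detail fields is not documented here.

```typescript
// Assumed event shape, based on the single sample line in this README.
interface ReaperEvent {
  category: string;
  detail: unknown;
}

// Pull every well-formed [REAPER] JSON line out of captured stderr text.
function parseReaperLines(stderr: string): ReaperEvent[] {
  const marker = "[REAPER]";
  const events: ReaperEvent[] = [];
  for (const line of stderr.split("\n")) {
    const idx = line.indexOf(marker);
    if (idx < 0) continue; // ordinary stderr output from the sample
    try {
      const parsed = JSON.parse(line.slice(idx + marker.length).trim());
      if (parsed && typeof parsed.category === "string") {
        events.push({ category: parsed.category, detail: parsed.detail });
      }
    } catch {
      // Not valid JSON after the marker; ignore the line.
    }
  }
  return events;
}

const stderrSample = [
  "some unrelated stderr noise",
  '[REAPER] {"category":"fetch","detail":{"url":"https://bsc-testnet-rpc.publicnode.com/","method":"POST"}}',
  "[REAPER] not-json",
].join("\n");
const reaperEvents = parseReaperLines(stderrSample);
```

Skipping malformed lines rather than throwing keeps the parser usable on output from samples that write their own noise to stderr.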
Install
From npm (once published):
npm install -g @sresarehumantoo/reaper
reaper "src/**/*.js"
From source (development):
make # installs deps, typechecks, compiles to dist/
# or run reaper from source without building:
npx tsx src/cli.ts <pattern>
make help lists every available target. The common ones: make build, make typecheck, make sandbox (build the docker analysis image), make demo (deobfuscate the bundled EtherHiding fixture end-to-end), make ci (typecheck + artifact hash verification).
Usage
Static scan
# Default scan - all analyzers on
reaper "src/**/*.ts"
# Scan a captured DOM (extracts inline + data: URI scripts automatically)
reaper page.html
# JSON output
reaper malware.js --format json --output report.json
# Function inventory + reduction report
reaper packed.js --analyze
# Cross-scope reachability (eval-aware) with auto-detected entry points
reaper packed.js --reachability
# Reachability with explicit entry points
reaper malware.js --reachability --entry sendCode,init
# Disable specific analyzers
reaper "src/**/*.js" --no-obfuscation --no-dead-branches
Exit code is non-zero when findings are present, so it composes with CI.
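For readers unfamiliar with what --reachability computes, the core idea is a breadth-first search over a call graph, starting from the entry points. The sketch below illustrates that idea only; it is not the code in src/graph/. The graph contents are invented, with sendCode and init borrowed from the --entry example above, and a real call graph would be built from the AST rather than written by hand.

```typescript
// Caller name -> names of the functions it calls (hypothetical graph).
type CallGraph = Map<string, string[]>;

// Breadth-first search from the entry points; anything never visited is dead.
function reachableFrom(graph: CallGraph, entries: string[]): Set<string> {
  const seen = new Set<string>(entries);
  const queue = entries.slice();
  while (queue.length > 0) {
    const fn = queue.shift() as string;
    for (const callee of graph.get(fn) ?? []) {
      if (!seen.has(callee)) {
        seen.add(callee);
        queue.push(callee);
      }
    }
  }
  return seen;
}

const demoGraph: CallGraph = new Map([
  ["init", ["sendCode"]],
  ["sendCode", ["encode"]],
  ["encode", []],
  ["unusedHelper", ["encode"]], // calls live code but is never called itself
  ["deadDecoder", []],
]);

const liveFns = reachableFrom(demoGraph, ["init"]);
const deadFns = Array.from(demoGraph.keys()).filter((name) => !liveFns.has(name));
```

Note that unusedHelper is reported dead even though it calls a live function: reachability follows edges from the entry points outward, not the other way round.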
Deobfuscate (rewrite mode)
# HTML → b64 → obfuscator.io string-array deobfuscation, plaintext written to out/
reaper page.html --rewrite out/
# Same for raw JS
reaper obfuscated.js --rewrite out/
For each input the rewriter reports how many wrapper calls were substituted; outputs are <name>.deobf.js (or <name>.js pass-through when no string-array pattern was detected).
Extract IOCs
# Indicators of compromise (URLs, domains, IPv4, EVM addresses + selectors,
# base64 blobs, high-entropy strings, emails), with context hints
reaper sample.js --iocs
# Machine-readable JSON for downstream pipelines
reaper sample.js --iocs --format json --output iocs.json
For best recall, run --rewrite first and then --iocs against the deobfuscated output; IOCs hidden behind a string-array decoder won't be visible in the raw form.
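As a rough picture of what indicator extraction involves, the sketch below matches three of the indicator kinds listed above with regular expressions and computes Shannon entropy, a common way to score how random a string looks. The patterns, the Ioc shape, and the helper names are assumptions made for illustration; reaper's actual patterns, thresholds, and context hints are not documented in this README and are not reproduced here.

```typescript
interface Ioc {
  kind: "url" | "ipv4" | "evm-address";
  value: string;
}

// Illustrative patterns only; real extractors need more careful boundaries.
const IOC_PATTERNS: Array<{ kind: Ioc["kind"]; re: RegExp }> = [
  { kind: "url", re: /https?:\/\/[^\s"'<>)]+/g },
  {
    kind: "ipv4",
    re: /\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b/g,
  },
  { kind: "evm-address", re: /\b0x[0-9a-fA-F]{40}\b/g },
];

function extractIocs(text: string): Ioc[] {
  const found: Ioc[] = [];
  for (const { kind, re } of IOC_PATTERNS) {
    for (const value of text.match(re) ?? []) {
      found.push({ kind, value });
    }
  }
  return found;
}

// Shannon entropy in bits per character: 0 for "aaaa", 2 for "abcd".
function shannonEntropy(s: string): number {
  const counts = new Map<string, number>();
  for (let i = 0; i < s.length; i++) {
    const c = s.charAt(i);
    counts.set(c, (counts.get(c) ?? 0) + 1);
  }
  let h = 0;
  counts.forEach((n) => {
    const p = n / s.length;
    h -= p * Math.log2(p);
  });
  return h;
}

const iocSample =
  'fetch("https://bsc-testnet-rpc.publicnode.com/", ' +
  '{ to: "0xA1decFB75C8C0CA28C10517ce56B710baf727d2e" }); // fallback 10.0.0.1';
const iocs = extractIocs(iocSample);
```

Patterns like these only see literal text, which is why running against deobfuscated output gives better recall than scanning the packed original.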
SARIF output (GitHub Code Scanning)
reaper "src/**/*.js" --format sarif --output reaper.sarif
The result is a SARIF 2.1.0 document. Upload it from a workflow with github/codeql-action/upload-sarif to get findings rendered in the repo's Code Scanning tab.
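For orientation, the overall shape of a minimal SARIF 2.1.0 document looks roughly like the object built below. The top-level fields (version, runs, tool.driver, results, locations) come from the SARIF specification; the Finding interface, the rule ID, the file path, and the "warning" level are hypothetical and may not match what reaper actually emits.

```typescript
// Hypothetical finding shape used only to feed the sketch.
interface Finding {
  ruleId: string;
  message: string;
  file: string;
  line: number;
}

// Build a minimal SARIF 2.1.0 log with one run and one result per finding.
function toSarif(findings: Finding[]) {
  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0",
    runs: [
      {
        tool: { driver: { name: "reaper" } },
        results: findings.map((f) => ({
          ruleId: f.ruleId,
          level: "warning", // assumed level; SARIF also allows "note" and "error"
          message: { text: f.message },
          locations: [
            {
              physicalLocation: {
                artifactLocation: { uri: f.file },
                region: { startLine: f.line },
              },
            },
          ],
        })),
      },
    ],
  };
}

const sarifDoc = toSarif([
  { ruleId: "example/unused-import", message: "Unused import", file: "src/a.js", line: 3 },
]);
const sarifJson = JSON.stringify(sarifDoc, null, 2);
```

Code Scanning uses the artifact URI and region to place each finding on a line of the file, so relative paths from the repository root tend to work best.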
Full pipeline (static + sandboxed dynamic)
# Static-only
./scripts/analyze.sh suspicious.js --static-only
# Static + dynamic with default constraints (network blocked, hard timeout)
./scripts/analyze.sh malware.js --timeout 30 --output-dir ./reports
# Observe network attempts without real egress (logs URL/method/body of every fetch)
./scripts/analyze.sh etherhiding.js --observe-network --timeout 20
# Log what eval would have executed without actually running it
./scripts/analyze.sh packed.js --observe-network --block-eval
The pipeline script runs the static analyzer, then builds and runs the Docker sandbox image (reaper-sandbox:latest) against the target.
Examples
See the warning at the top of this README before opening files under examples/.
- examples/etherhiding/ - full walkthrough of an in-the-wild EtherHiding + ClickFix sample. The directory contains the minimal HTML fixture, every intermediate stage fetched from BSC testnet contract storage, both OS-specific clipboard payloads, the deobfuscated plaintext of every JavaScript stage, and a step-by-step README.md you can follow with no network access. See examples/etherhiding/REPORT.md for the analysis report and IOCs.
- examples/dom01/ - original DOM dump (compromised WordPress page) from which the EtherHiding sample was extracted.
- examples/deadcode01/ - real-world obfuscated sample (sendCode.js) plus its companion files. Good test for reachability and eval-layer capture.
- examples/deadcode02/ - small p,a,c,k,e,r-packed flag. Try reaper examples/deadcode02/flag.js --reachability.
Quick end-to-end against the EtherHiding fixture:
# 1. Static deobfuscation - recovers the fetch URL, contract address, selector
reaper examples/etherhiding/sample.html --rewrite /tmp/out
# 2. Dynamic run with observe-network - captures the C2 request without egress
./scripts/analyze.sh examples/etherhiding/artifacts/stage1/payload.deobf.js \
--dynamic-only --observe-network --timeout 8
# 3. (Optional) Re-fetch the next stage from the live contract
./examples/etherhiding/fetch-evm-payload.mjs 0xA1decFB75C8C0CA28C10517ce56B710baf727d2e \
--out /tmp/dispatcher.js
Expected output from step 2 includes a [REAPER] {"category":"fetch","detail":{"url":"https://bsc-testnet-rpc.publicnode.com/","method":"POST", ...}} line containing the JSON-RPC eth_call body.
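Once the eth_call body from step 2 is in hand, the contract address and function selector can be read straight out of it. The sketch below assumes the body is available as a JSON string in the standard Ethereum JSON-RPC layout, where params[0] carries to and data; how reaper actually represents the body inside detail is not shown in this README. The selector value in the sample is made up, while the contract address is the one used in step 3.

```typescript
interface EthCallSummary {
  to: string; // contract being queried
  selector: string; // first 4 bytes of calldata, as "0x" + 8 hex characters
}

// Summarize a JSON-RPC eth_call request body; returns null for anything else.
function summarizeEthCall(body: string): EthCallSummary | null {
  let req: any;
  try {
    req = JSON.parse(body);
  } catch {
    return null;
  }
  if (!req || req.method !== "eth_call" || !Array.isArray(req.params)) return null;
  const call = req.params[0];
  if (!call || typeof call.to !== "string" || typeof call.data !== "string") {
    return null;
  }
  return { to: call.to, selector: call.data.slice(0, 10) };
}

// Hypothetical request body; only the "to" address comes from this README.
const ethCallBody = JSON.stringify({
  jsonrpc: "2.0",
  id: 1,
  method: "eth_call",
  params: [
    { to: "0xA1decFB75C8C0CA28C10517ce56B710baf727d2e", data: "0x12345678" },
    "latest",
  ],
});
const ethCallSummary = summarizeEthCall(ethCallBody);
```

The address and selector pair is what identifies the payload-hosting contract method, so it is a natural thing to record alongside the other IOCs.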
Project layout
src/
cli.ts # commander entrypoint
parser/
index.ts # @babel/parser wrapper
html.ts # .html input → extracted <script> / data: URI subfiles
analyzers/
imports.ts # unused imports
references.ts # unused vars / functions
unreachable.ts # code after return/throw
branches.ts # constant-folded dead branches
obfuscation.ts # eval, Function, atob, fromCharCode, entropy
reachability.ts # top-level cross-scope reachability analyzer
evalscope.ts # eval interception → captured inner-layer sources
packer.ts # p,a,c,k,e,r detection + static unpack
stringarray.ts # obfuscator.io string-array detect + static rewrite
strfold.ts # constant-string folding inside dead bodies
functions.ts # function metadata extraction
graph/
callgraph.ts # build call graph from AST
reachability.ts # BFS over the graph, entry-point detection
reporter/
console.ts # default human-readable output
json.ts # JSON output
analysis.ts # --analyze inventory report
reachability.ts # --reachability report
docker/
Dockerfile # hardened sandbox image
runner.js # --require shim - logs eval/fetch/fs/http, supports observe/block modes
scripts/
analyze.sh # combined static + dynamic pipeline
examples/ # sample inputs (incl. etherhiding/)
Requirements
- Node.js 20+
- Docker (only required for the dynamic pipeline via scripts/analyze.sh)
License
MIT. See LICENSE.
