comprexia
v0.0.4
Published
High-performance HTTP compression addon with new algorithm (prototype)
Maintainers
Readme
Comprexia
High‑performance HTTP compression for modern APIs and real‑time backends, with a JSON‑aware codec, Node.js native bindings, and a lightweight browser decoder. As principal architects, we design compression systems around the realities of web payloads: repetitive keys, structured layouts, and latency over perfect minimum size. Comprexia is engineered to be operationally simple, streaming‑friendly, and flexible enough to serve both high‑throughput and higher‑ratio requirements.
Why Another Codec
- Most web payloads are JSON. General‑purpose codecs (gzip, Brotli, Zstd) excel universally but don’t exploit JSON’s structure unless you pre‑condition data (dictionary, transforms). Comprexia integrates structure awareness directly.
- Latency matters more than extreme ratio for many APIs. A codec that returns in sub‑10 ms on typical responses with consistent decode is often better than one that shaves a few extra kilobytes.
- Streaming and negotiation must be straightforward. Comprexia’s stream format is simple, and the middleware negotiates
Accept-Encodingseamlessly.
Comprexia isn’t meant to replace gzip/Brotli/Zstd outright. It’s a practical alternative to combine compression wins and operational simplicity for JSON‑heavy workloads.
Core Concepts
- JSON‑aware preprocessing: Interns common keys/tokens and stabilizes structure to improve repetition. This yields better matches and lowers header overhead compared to naive LZ on raw JSON.
- Two encoder levels:
fast: prioritizes throughput with a single‑position hash finder and short match extension. It’s the default for web APIs.advanced: searches longer matches and uses extended match headers to reduce overhead on large repeats. It achieves better ratios at increased CPU.
- Stream format:
- Literal block:
header<0x80followed byheaderliteral bytes - Match block:
0x80|(len-3)followed by 2‑byte distance - Extended match block:
0xFFfollowed by 2‑byte length and 2‑byte distance (for longer matches)
- Literal block:
- Browser decode: A tiny JS decoder reproduces the stream expansion client‑side. For
fast, you get fully correct JSON without a server decode route.
Install
npm install comprexiaRequirements: Node 18+, C++20 compiler, CMake. The native addon builds automatically during install.
Quick Start
import { compress, decompress } from 'comprexia'
const src = Buffer.from('hello world')
const c = compress(src)
const d = decompress(c)
console.log(d.toString())Integration Patterns
Express Middleware (Recommended)
import express from 'express'
import { createComprexiaMiddleware } from 'comprexia'
const app = express()
// Default to fast for latency; opt into advanced where ratio matters
app.use(createComprexiaMiddleware({ level: 'fast' }))
app.get('/api/posts', (_req, res) => {
res.json({ success: true, data: [{ id: 1, title: 'hello' }] })
})
app.listen(3001)When clients send Accept-Encoding: cx, responses include:
Content-Encoding: cxX-Compression-RatioX-Original-SizeX-Compressed-Size
Fastify
import Fastify from 'fastify'
import { createComprexiaMiddleware } from 'comprexia'
const app = Fastify()
// Wrap composable middleware
app.addHook('onRequest', (req, reply, done) => done())
app.addHook('preHandler', (req, reply, done) => done())
// Use Comprexia in reply decorator
app.decorateReply('cxJson', function (payload: any) {
const enc = (this.request.headers['accept-encoding'] || '') as string
if (enc.includes('cx')) {
const s = JSON.stringify(payload)
const { compressFast } = require('comprexia')
const out = compressFast(Buffer.from(s))
this.header('Content-Encoding', 'cx')
this.header('Content-Type', 'application/json')
return this.send(out)
}
return this.send(payload)
})
app.get('/api/posts', async (_req, reply) => {
return (reply as any).cxJson({ success: true, data: [] })
})
app.listen({ port: 3002 })NestJS
import { Injectable, NestMiddleware } from '@nestjs/common'
import { compressFast, negotiateEncoding } from 'comprexia'
@Injectable()
export class ComprexiaMiddleware implements NestMiddleware {
use(req: any, res: any, next: () => void) {
const originalJson = res.json.bind(res)
res.json = (body: any) => {
const enc = negotiateEncoding(req.headers['accept-encoding'] || '')
if (enc === 'cx') {
const s = JSON.stringify(body)
const out = compressFast(Buffer.from(s))
res.setHeader('Content-Encoding', 'cx')
res.setHeader('Content-Type', 'application/json')
return res.send(out)
}
return originalJson(body)
}
next()
}
}WebSocket / Streaming
const { negotiateEncoding, createCompressorStream } = require('comprexia')
// Express chunked response
app.get('/events', (req, res) => {
const enc = negotiateEncoding(req.headers['accept-encoding'])
if (enc === 'cx') {
res.setHeader('Content-Encoding', 'cx')
const c = createCompressorStream()
c.pipe(res)
c.write(Buffer.from(JSON.stringify({ type: 'init' })))
setInterval(() => c.write(Buffer.from(JSON.stringify({ type: 'tick', t: Date.now() }))), 1000)
} else {
res.write(JSON.stringify({ type: 'init' }))
setInterval(() => res.write(JSON.stringify({ type: 'tick', t: Date.now() })), 1000)
}
})Frontend Decode (No Server Decode Route)
Use the provided browser decoder for cx payloads in the client.
import axios from 'axios'
import { decompressToString } from 'comprexia/web/decoder'
const api = axios.create({ baseURL: '/api', headers: { 'Accept-Encoding': 'cx' } })
async function fetchJson<T>(path: string): Promise<T> {
const res = await api.get(path, { responseType: 'arraybuffer' })
if (res.headers['content-encoding'] === 'cx') {
const jsonStr = decompressToString(res.data as ArrayBuffer)
return JSON.parse(jsonStr)
}
const text = new TextDecoder().decode(new Uint8Array(res.data as ArrayBuffer))
return JSON.parse(text)
}This decoder expands the literal/match stream and extended matches. It is intentionally small and latency‑oriented. For advanced post‑processing stages (e.g., deep token interning or UTF‑8 normalization), WASM support may be introduced in future versions; today, fast delivers the best client‑side experience.
API Reference
compress(input: Buffer): Buffer— basic compression, streaming‑compatibledecompress(input: Buffer): Buffer— basic decompressioncompressFast(input: Buffer): Buffer— speed‑optimized encodercompressAdvanced(input: Buffer): Buffer— ratio‑optimized encoderdecompressAdvanced(input: Buffer): Buffer— reverses advanced transforms server‑sidecreateComprexiaMiddleware(opts?: { level?: 'fast' | 'advanced' })— plug‑and‑play Express middlewarecreateCompressorStream()— Node stream for chunked responsesnegotiateEncoding(acceptEncodingHeader?: string): 'cx' | 'identity'— checks client supportcomprexia/web/decoder— browser decoder export
Architecture Overview
- Encoder: LZ‑style match finder keyed by 4‑byte hash, with single‑position lookup in
fastand longer extension inadvanced. Extended match blocks (0xFF) reduce header costs for very long repeats. - Decoder: deterministic replay of literal blocks and back‑references. The browser module mirrors the native decoder’s logic, excluding server‑side JSON/UTF‑8 post‑processing used by
advanced. - Preprocessor: for
advancedmode, common JSON tokens/keys can be interned to compact values, and simple transforms stabilize repeated structures. This increases match quality and compressibility. Server‑sidedecompressAdvancedreverses these transforms. - Node bindings: N‑API addon compiled in Release mode for production throughput.
Performance Characteristics
- Large JSON (3k users, benchmark snapshot):
- Comprexia‑Fast: ratio ≈ 0.080, compress ≈ 1206 MB/s
- Comprexia‑Adv: ratio ≈ 0.054, compress ≈ 150 MB/s
- Gzip: ratio ≈ 0.028, compress ≈ 355 MB/s
- Brotli: ratio ≈ 0.008, compress ≈ 250 MB/s
- Zstd: ratio ≈ 0.010, compress ≈ 153 MB/s
- Nested JSON: ratio differences reflect structural complexity; fast favors speed, advanced yields parity with its own settings but may trail Brotli/Zstd in minimum sizes.
- Text 256KB: Comprexia provides respectable speed; general‑purpose codecs can produce very small outputs due to entropy coding.
Interpretation: Choose fast for latency‑critical JSON APIs where middle‑of‑the‑road ratios are acceptable and speed consistency matters. Choose advanced when you control both ends and absolute size becomes more important.
Operational Guidance
- Negotiation: Rely on
Accept-Encoding. Ifcxisn’t present, fall back to standardres.jsonor existing gzip middleware. This ensures broad compatibility and incremental rollouts. - Telemetry: Expose compression headers to monitor effectiveness. Track ratio and absolute bytes to inform level adjustments.
- Compatibility: Maintain gzip/Brotli for third‑party clients. Comprexia serves as an opt‑in compression for capable clients.
- Streaming: Prefer
createCompressorStream()for event streams and long polls; it integrates naturally and saves intermediate allocations.
When To Use Comprexia
- High‑traffic APIs with JSON payloads where latency and throughput are paramount
- Real‑time systems (chat, telemetry) that benefit from streaming compression
- Internal microservices using stable schemas, where structure‑aware transforms improve matches
- Mixed environments where you can negotiate
cxwith your own clients and keep gzip for others
When To Prefer gzip/Brotli/Zstd
- Static assets and text blobs where the smallest possible size is the primary goal
- Public internet clients where a custom codec negotiation is impractical
- Infrastructure locked to specific codecs by policy or platform constraints
Migration Plan
- Phase 1: Add middleware in
fastmode. Ship clients withAccept-Encoding: cx. Confirm headers reflect expected ratios. - Phase 2: Move selected endpoints to
advancedwhere payloads are large and controlled. Keep browser decoding tofastunless WASM is introduced. - Phase 3: Consider dictionary seeding and additional transforms if your schemas are highly repetitive.
Troubleshooting
- Build failures: Ensure C++20 compiler and CMake present;
npm run build:releasecompiles the addon. - Module not found: Run
npm run buildto produce TypeScript outputs. - Unexpected ratios: Review benchmarks on your real payloads; tune
fastvsadvancedper route. - Browser decode errors: Confirm
Content-Encoding: cxis set andresponseType: 'arraybuffer'is used. Use the exportedcomprexia/web/decoderhelpers.
Benchmarks
Run:
npm run benchmark:depthThe benchmark suite compares Comprexia against gzip/Brotli/Zstd across JSON and text datasets. Use it to calibrate level choices and validate improvements.
Build From Source
npm install
npm run build # TypeScript
npm run build:release # Native addon (Release)
npm test # Roundtrip testsRequires Node 18+, a C++20 compiler, and CMake.
Security Notes
- Do not compress sensitive payloads without transport security (TLS). Compression can amplify side channels if combined with attacker‑controlled inputs; treat compression as an optimization, not a security boundary.
- Avoid logging compressed binary payloads directly; prefer telemetry headers for observability.
Roadmap
- WASM decoder for advanced transforms to enable full client‑side decode without server involvement
- Adaptive dictionary options tailored to your API schemas
- Advanced entropy stage for headers and distance/length coding to close ratio gaps with Brotli/Zstd
- CLI tooling for batch compression of datasets and logs
License
MIT
