@quicore/expressjs

v3.0.0

Published

7 days ago

Fast-start Express server with production-grade scaling primitives.


ageofai

carnivalofrust

darkknight

@quicore/expressjs

Fast-start Express server with sensible defaults, a power-user plugin system, and production-grade scaling primitives built in.

import { QuicoreExpressServer } from '@quicore/expressjs';
const server = new QuicoreExpressServer();
server.setRoutes([{ method: 'get', path: '/', handlers: (req, res) => res.send('ok') }]);
await server.start();

This still works exactly as before. Most v3 additions are opt-in; changes to defaults (HTTP timeouts on, captureRaw off) are called out in the sections below.

What's new in v3

Production HTTP timeouts (on by default)

new QuicoreExpressServer({
  timeouts: {
    keepAlive: 65_000,   // > LB idle timeout (typical: 60s)
    headers:   66_000,   // must be > keepAlive (Node requirement)
    request:   30_000,   // total request time cap
  },
});

Defaults are tuned for environments behind AWS ALB / GCP LB / Cloudflare. keepAlive must exceed the LB's idle timeout: otherwise the LB can send a request on a socket the origin is about to close, which surfaces as random 502s on the client side.
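The ordering constraints can be checked mechanically. The following is a minimal sketch, assuming the three options correspond to Node's server.keepAliveTimeout, server.headersTimeout and server.requestTimeout (this README does not say how the library applies them). toNodeTimeouts and lbIdleMs are hypothetical names, not part of @quicore/expressjs.

```javascript
// Hypothetical helper, not part of @quicore/expressjs.
// Validates the ordering described above and returns the values keyed by the
// Node http.Server property names they are ASSUMED to map to.
function toNodeTimeouts({ keepAlive, headers, request }, lbIdleMs = 60_000) {
  if (keepAlive <= lbIdleMs) {
    throw new RangeError(
      `keepAlive (${keepAlive}) must exceed the LB idle timeout (${lbIdleMs})`
    );
  }
  if (headers <= keepAlive) {
    throw new RangeError(`headers (${headers}) must exceed keepAlive (${keepAlive})`);
  }
  return {
    keepAliveTimeout: keepAlive,
    headersTimeout: headers,
    requestTimeout: request,
  };
}

// Possible usage against a plain Node server (properties set after creation):
// Object.assign(httpServer, toNodeTimeouts({ keepAlive: 65_000, headers: 66_000, request: 30_000 }));
```

Failing fast at startup on a bad ordering is a design choice for this sketch; a misordered pair would otherwise only show up later as intermittent 502s.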

Load shedding (opt-in)

Three independent signals to reject excess work fast — protect the process instead of letting it collapse:

new QuicoreExpressServer({
  loadShed: {
    maxInFlight: 1000,           // 503 above this many concurrent requests
    maxEventLoopLagMs: 200,      // 503 when p99 event-loop lag > 200ms
    maxConnections: 10_000,      // TCP-level cap (server.maxConnections)
    retryAfterSeconds: 1,        // included in 503 response
  },
});

At 5–20k RPS you absolutely want at least maxInFlight set — without it, a slow downstream causes unbounded queue growth in your process.
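To make the maxInFlight signal concrete, here is a minimal sketch of an in-flight gate written as Express-style middleware. It is an illustration under assumptions, not the library's implementation: createInFlightGate and its state object are hypothetical, and only the option names maxInFlight and retryAfterSeconds come from the config above.

```javascript
// Hypothetical in-flight gate; the library's actual implementation is not
// shown in this README.
function createInFlightGate({ maxInFlight, retryAfterSeconds = 1 }) {
  const state = { inFlight: 0, shed: 0 };

  function middleware(req, res, next) {
    if (state.inFlight >= maxInFlight) {
      // Reject immediately instead of queueing behind a slow downstream.
      state.shed += 1;
      res.setHeader('Retry-After', String(retryAfterSeconds));
      res.statusCode = 503;
      res.end('Service Unavailable');
      return;
    }

    state.inFlight += 1;
    let released = false;
    const release = () => {
      if (released) return; // 'finish' and 'close' can both fire
      released = true;
      state.inFlight -= 1;
    };
    res.once('finish', release);
    res.once('close', release);
    next();
  }

  return { middleware, state };
}
```

With a plain Express app this would be mounted ahead of the routes, for example app.use(createInFlightGate({ maxInFlight: 1000 }).middleware). The counter is released on whichever of finish or close fires first, so aborted requests do not leak slots.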

Metrics exposed at server.metrics:

{
  inFlight: 234,
  maxInFlight: 1000,
  eventLoopLagMs: 12.5,
  maxEventLoopLagMs: 200,
  shed: { inFlight: 0, eventLoop: 0 },
}

/ready (via healthPlugin) reads these and returns 503 when overloaded, so the LB stops sending traffic before things break.
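The readiness decision can be sketched as a pure function over a metrics snapshot shaped like the one above. This is an assumption about how such a check might look, not healthPlugin's code: readinessStatus is a hypothetical name, and the exact comparison operators are guesses.

```javascript
// Hypothetical readiness check over a metrics snapshot; thresholds and
// comparison operators are assumptions, not taken from healthPlugin.
function readinessStatus(metrics, draining = false) {
  const overloaded =
    metrics.inFlight >= metrics.maxInFlight ||
    metrics.eventLoopLagMs > metrics.maxEventLoopLagMs;
  // 503 tells the load balancer to stop routing new traffic to this instance.
  return draining || overloaded ? 503 : 200;
}
```

Keeping the check free of I/O means the probe stays cheap even when the process is already struggling.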

HTTP/2 (opt-in)

new QuicoreExpressServer({
  http:  { port: 8080, http2: true },   // h2c — plaintext HTTP/2
  https: { enabled: true, port: 8443, ssl: {...}, http2: true },  // h2: HTTP/2 over TLS
});

allowHTTP1: true is set so the same server handles both HTTP/1.1 and HTTP/2 clients. All existing Express middleware works unchanged.

Use this when HTTP/2 terminates at the origin (service mesh, gRPC-Web, server push). For edge-terminated deployments behind CloudFront/Cloudflare, leave it off: the edge handles H2/H3 and talks H1 to your origin.

Two-phase graceful shutdown

The shutdown sequence is now drain-then-close, which is critical in K8s/LB environments:

new QuicoreExpressServer({
  shutdown: {
    drainDelayMs:   5_000,    // pause so LB deregisters us
    gracePeriodMs: 30_000,    // then close, allow in-flight to finish
  },
});

Sequence on SIGTERM:

1. _draining = true, draining event fires, /ready starts returning 503
2. Wait drainDelayMs for LB to remove us from rotation
3. server.close(): refuse new connections
4. closeIdleConnections() (Node 18.2+): drop idle keep-alive sockets
5. Wait up to gracePeriodMs for in-flight requests to complete
6. closeAllConnections(): destroy survivors
7. Exit

The infra triggers this (K8s preStop hook + SIGTERM); the class handles the choreography.
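The sequence can be sketched against any object that exposes Node's http.Server methods close(), closeIdleConnections() and closeAllConnections(). This is an illustrative sketch only: gracefulShutdown, sleep and onDraining are hypothetical names, and the library's internals may differ.

```javascript
// Hypothetical sketch of the drain-then-close sequence, not the library's code.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function gracefulShutdown(server, { drainDelayMs, gracePeriodMs, onDraining = () => {} }) {
  onDraining();                  // step 1: flip readiness to 503
  await sleep(drainDelayMs);     // step 2: give the LB time to deregister us

  // step 3: stop accepting connections; resolves once all connections end
  const closed = new Promise((resolve) => server.close(resolve));
  server.closeIdleConnections(); // step 4: drop idle keep-alive sockets

  // step 5: wait for in-flight requests, but no longer than gracePeriodMs
  let timer;
  const timedOut = new Promise((resolve) => {
    timer = setTimeout(() => resolve('timeout'), gracePeriodMs);
  });
  const outcome = await Promise.race([closed.then(() => 'drained'), timedOut]);
  clearTimeout(timer);

  if (outcome === 'timeout') server.closeAllConnections(); // step 6: destroy survivors
  return outcome;                // step 7 (process exit) is left to the caller
}
```

A caller would typically wire this to process.once('SIGTERM', ...) and exit after it resolves. Returning the outcome rather than exiting inside the function keeps the sketch testable.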

Streaming body strategies (the FHIR/HL7 fix)

The old captureRaw buffered every request body in the V8 heap. With 10 MB payloads and 100 concurrent requests, that is roughly 1 GB of buffered bodies, enough to OOM the process. v3 has three bounded alternatives:

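import { pipeline } from 'node:stream/promises';
import { rawBodyCapture, streamBodyToFile, streamBodyToHandler } from '@quicore/expressjs';

// verifyAndHandleWebhook, processBundle and createS3UploadStream are your own application code.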

// 1. Bounded raw Buffer — for HMAC signature verification (small payloads)
server.setRoutes([{
  method: 'post',
  path: '/webhooks/stripe',
  handlers: [rawBodyCapture({ limit: '256kb' }), verifyAndHandleWebhook],
}]);

// 2. Stream large bodies to a temp file — for FHIR Bundles / HL7 batches
server.setRoutes([{
  method: 'post',
  path: '/fhir/Bundle',
  handlers: [streamBodyToFile({ limit: '500mb', dir: '/var/quicore/uploads' }), processBundle],
}]);
// In your handler: req.bodyFile = { path, size, contentType }
// File is auto-deleted on response finish.

// 3. Hand the raw stream to your handler — for direct forwarding to S3/Kafka
server.setRoutes([{
  method: 'post',
  path: '/hl7/batch',
  handlers: streamBodyToHandler(async (req, res) => {
    await pipeline(req, await createS3UploadStream(req));
    res.status(202).json({ received: true });
  }, { limit: '5gb' }),
}]);

All three:

Enforce hard size limits (413 on overflow, socket destroyed — no further bytes read)
Skip if a body parser already consumed the body upstream
Mount per-route (NOT globally) — the default JSON parser still handles your normal routes
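The hard size limit is the part that keeps memory bounded. A minimal sketch of the idea follows, assuming the body arrives as an async iterable of Buffers (a Node request object qualifies). readBounded and PayloadTooLargeError are hypothetical names; this is not the implementation of rawBodyCapture.

```javascript
// Hypothetical bounded reader; NOT the library's rawBodyCapture implementation.
class PayloadTooLargeError extends Error {
  constructor(limitBytes) {
    super(`request body exceeds ${limitBytes} bytes`);
    this.name = 'PayloadTooLargeError';
    this.status = 413;
  }
}

// `source` is any async iterable of Buffers, e.g. an http.IncomingMessage.
async function readBounded(source, limitBytes) {
  const chunks = [];
  let received = 0;
  for await (const chunk of source) {
    received += chunk.length;
    if (received > limitBytes) {
      // Stop before buffering the offending chunk. Leaving a for-await loop
      // early calls the iterator's return(), which destroys a Node stream.
      throw new PayloadTooLargeError(limitBytes);
    }
    chunks.push(chunk);
  }
  return Buffer.concat(chunks, received);
}
```

Memory use is capped at limitBytes per request regardless of what the client sends, which is the property that matters for HMAC verification of small webhook payloads.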

captureRaw at the server level (bodyParser.json.captureRaw: true) is now off by default. It still works for backward compat, but the streaming middleware is the production answer.

What's deliberately NOT in scope

These are infra concerns, not server-class concerns:

Clustering / worker_threads: Use K8s replicas, PM2, or Node's cluster module. One process per container is the modern norm.
HTTP/3 / QUIC: Edge-terminate at Cloudflare/CloudFront. Node's QUIC support isn't production-ready.
Distributed rate limiting: Use a sidecar (Envoy) or external store (Redis). The framework offers per-instance limits via plugins.
TLS certificate rotation: Use cert-manager / certbot / your secrets manager. Restart the process to pick up new certs (K8s rolling restart handles this).
Metrics scraping: Expose server.metrics via the optional /metrics endpoint or wire it to your Prometheus client. Not bundled.

Recommended config for 5–20k RPS per instance

new QuicoreExpressServer({
  http: { port: 3000 },                           // edge-terminate TLS
  trustProxy: 1,                                  // honor X-Forwarded-* from one proxy hop

  timeouts: {
    keepAlive: 65_000,
    headers:   66_000,
    request:   30_000,
  },

  loadShed: {
    maxInFlight: 2000,                            // tune from observed p99 concurrency
    maxEventLoopLagMs: 200,
    maxConnections: 10_000,
  },

  bodyParser: {
    json: { limit: '1mb' },                       // captureRaw off; use streaming middleware for large payloads
  },

  shutdown: {
    drainDelayMs:  10_000,                        // K8s preStop sleep should match
    gracePeriodMs: 30_000,                        // K8s terminationGracePeriodSeconds should be > this
  },

  logLevel: 'info',
});

K8s deployment companions (rough):

spec:
  terminationGracePeriodSeconds: 60   # > preStop sleep (10s) + drainDelayMs (10s) + gracePeriodMs (30s)
  containers:
    - name: app
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "10"] # matches drainDelayMs
      readinessProbe:
        httpGet:  { path: /ready, port: 3000 }
        periodSeconds: 2
      livenessProbe:
        httpGet:  { path: /health, port: 3000 }
        periodSeconds: 10
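When sizing terminationGracePeriodSeconds, note that Kubernetes starts that countdown when the preStop hook begins and sends SIGTERM only after the hook finishes, so the preStop sleep, drainDelayMs and gracePeriodMs run back to back. The arithmetic can be sketched as follows; minTerminationGraceSeconds is a hypothetical helper, not something the package exports.

```javascript
// Hypothetical helper: smallest budget (in seconds) that covers the preStop
// sleep plus the in-process drain and grace phases, which run sequentially.
function minTerminationGraceSeconds({ preStopSleepSeconds, drainDelayMs, gracePeriodMs }) {
  return preStopSleepSeconds + Math.ceil((drainDelayMs + gracePeriodMs) / 1000);
}

// Values from the recommended config: 10 s sleep + 10 s drain + 30 s grace.
const budget = minTerminationGraceSeconds({
  preStopSleepSeconds: 10,
  drainDelayMs: 10_000,
  gracePeriodMs: 30_000,
});
console.log(budget); // 50
```

The configured value should sit above this figure with a few seconds of headroom so the process is not SIGKILLed while in-flight requests are still completing.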
