@techalmondsai/nodejs-monitoring
v1.1.0
A comprehensive monitoring service for Node.js applications with built-in health probes and metrics collection
Node.js Monitoring Service
A comprehensive, container-aware monitoring service for Node.js applications with built-in health probes, metrics collection, Kubernetes-native endpoints, and graceful shutdown support. A lightweight alternative to New Relic.
Features
- Easy Integration - One-line setup for Express applications
- Container-Aware - Reads cgroup v1/v2 for accurate memory and CPU metrics in Docker/Kubernetes
- Kubernetes-Native - Separate /healthz (liveness) and /readyz (readiness) endpoints
- Built-in Metrics - Memory, CPU, heap, uptime, request tracking
- Health Probes - Memory, CPU, response time, error rate, disk space, uptime
- Graceful Shutdown - SIGTERM handling, interval cleanup, shutdown-aware probes
- Performance Tracking - Request/response time monitoring; only 5xx responses count as errors
- Custom Probes - Add your own health checks with configurable timeouts
- REST API - Built-in endpoints for health, metrics, liveness, and readiness
Installation
npm install @techalmondsai/nodejs-monitoring
Quick Start
Basic Setup
import express from "express";
import { setupMonitoring } from "@techalmondsai/nodejs-monitoring";
const app = express();
// Set up monitoring with one line
const monitoring = setupMonitoring(app);
// Your routes
app.get("/api/users", (req, res) => {
  res.json({ users: [] });
});
app.listen(3000, () => {
  console.log("Server running on port 3000");
  console.log("Health check: http://localhost:3000/health");
  console.log("Liveness: http://localhost:3000/healthz");
  console.log("Readiness: http://localhost:3000/readyz");
});
Advanced Configuration
import express from "express";
import { setupMonitoring } from "@techalmondsai/nodejs-monitoring";
const app = express();
const monitoring = setupMonitoring(app, {
  healthRoutePath: "/api/health",
  metricsInterval: 15000,
  enableRequestTracking: true,
  enableErrorTracking: true,
  livenessPath: "/api/liveness",
  readinessPath: "/api/readiness",
  probeTimeout: 5000,
  alertThresholds: {
    memoryUsage: 85,
    cpuUsage: 90,
    responseTime: 2000,
    errorRate: 5,
  },
});
// Add a custom health probe.
// `database` stands for your application's own client; it is not part of this package.
monitoring.addProbe({
  name: "database_connection",
  check: async () => {
    try {
      await database.ping();
      return {
        status: "healthy",
        message: "Database connection successful",
      };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      return {
        status: "critical",
        message: `Database connection failed: ${reason}`,
      };
    }
  },
  interval: 30000,
});
app.listen(3000);
API Endpoints
Once integrated, your application will have the following endpoints:
Health Check (Full Report)
GET /health
Returns comprehensive health information, including all metrics and probe results. Returns 200 when healthy and 503 when any probe is in a warning or critical state.
{
  "status": "healthy",
  "timestamp": 1640995200000,
  "uptime": 3600,
  "hostname": "my-pod-abc123",
  "metrics": {
    "timestamp": 1640995200000,
    "uptime": 3600,
    "memory": {
      "used": 256,
      "total": 2048,
      "percentage": 12,
      "heap": {
        "used": 80,
        "total": 1584,
        "percentage": 5
      },
      "isContainerAware": true
    },
    "cpu": {
      "usage": 15.5,
      "loadAverage": [0.5, 0.3, 0.2],
      "effectiveCpus": 0.5,
      "isContainerAware": true
    },
    "process": {
      "pid": 1,
      "version": "v18.17.0",
      "platform": "linux",
      "arch": "x64",
      "hostname": "my-pod-abc123"
    },
    "requests": {
      "total": 150,
      "active": 2,
      "averageResponseTime": 120
    },
    "errors": {
      "total": 5,
      "rate": 0.2
    }
  },
  "probes": {
    "memory_usage": {
      "status": "healthy",
      "message": "Memory usage is 12%",
      "value": 12
    },
    "cpu_usage": {
      "status": "healthy",
      "message": "CPU usage is 15.5%",
      "value": 15.5
    },
    "response_time": {
      "status": "healthy",
      "message": "Avg response time is 120ms",
      "value": 120
    },
    "error_rate": {
      "status": "healthy",
      "message": "Error rate is 0.2/min",
      "value": 0.2
    },
    "uptime": {
      "status": "healthy",
      "message": "Application has been running for 1h 0m",
      "value": 3600
    },
    "disk_space": {
      "status": "healthy",
      "message": "Disk usage is 45%",
      "value": 45
    }
  },
  "version": "1.0.0"
}
Kubernetes Liveness Probe
GET /healthz
Returns 200 if the process is alive and not shutting down. It does not check downstream dependencies, which prevents unnecessary pod restarts.
{
  "status": "alive",
  "uptime": 3600,
  "hostname": "my-pod-abc123"
}
Kubernetes Readiness Probe
GET /readyz
Returns 200 if the service is ready to accept traffic. Returns 503 only when a probe is critical; warnings are treated as still ready.
{
  "status": "ready",
  "hostname": "my-pod-abc123"
}
Metrics History
GET /health/metrics?limit=50
Returns historical metrics data (up to 100 entries, clamped):
{
  "metrics": [],
  "count": 50,
  "latest": {}
}
Kubernetes Configuration
livenessProbe:
  httpGet:
    path: /healthz
    port: 3000
  initialDelaySeconds: 10
  periodSeconds: 15
  failureThreshold: 3
  timeoutSeconds: 5
readinessProbe:
  httpGet:
    path: /readyz
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
  timeoutSeconds: 3
The paths are configurable via livenessPath and readinessPath in the config.
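When tuning failureThreshold and periodSeconds, it can help to see what the kubelet will observe from each endpoint. The following is a minimal sketch of the status-code rules as this README describes them, not the library's source. The names ProbeStatus, healthStatusCode, readinessStatusCode, and livenessStatusCode are hypothetical and exist only for illustration.

```typescript
type ProbeStatus = "healthy" | "warning" | "critical";

// /health: 503 when any probe is in a warning or critical state.
function healthStatusCode(statuses: ProbeStatus[]): number {
  return statuses.every((s) => s === "healthy") ? 200 : 503;
}

// /readyz: 503 only when a probe is critical, or while shutting down.
// Warnings are treated as still ready.
function readinessStatusCode(
  statuses: ProbeStatus[],
  isShuttingDown = false,
): number {
  if (isShuttingDown) return 503;
  return statuses.includes("critical") ? 503 : 200;
}

// /healthz: ignores probe results; only the shutdown flag matters.
function livenessStatusCode(isShuttingDown: boolean): number {
  return isShuttingDown ? 503 : 200;
}

// A single warning fails /health but keeps the pod in rotation.
const statuses: ProbeStatus[] = ["healthy", "warning"];
console.log(healthStatusCode(statuses), readinessStatusCode(statuses));
```

Under these rules a warning never removes a pod from service, and a failing downstream dependency never triggers a restart, because only /healthz drives the liveness probe.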
Built-in Probes
| Probe | Default Threshold | Description |
|-------|------------------|-------------|
| Memory Usage | 80% | Container-aware (cgroup v1/v2). Warning at 80% of threshold, critical above. |
| CPU Usage | 80% | Normalized against cgroup CPU quota. Warning at 80% of threshold, critical above. |
| Response Time | 5000ms | Average response time across last 100 requests. |
| Error Rate | 10/min | 5xx errors per minute (4xx excluded). Adjusts window for short uptimes. |
| Uptime | - | Application uptime. Always healthy. |
| Disk Space | 85% | Actual disk usage via statfsSync or async df. Warning at 80% of threshold. |
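The "warning at 80% of threshold, critical above" rule in the table can be read as a two-step comparison. The sketch below shows one plausible reading; the classify function is hypothetical, and the exact boundary behavior (inclusive or exclusive) is an assumption rather than something the table specifies.

```typescript
type ProbeStatus = "healthy" | "warning" | "critical";

// Assumed rule: critical above the threshold, warning from 80% of the
// threshold up to the threshold itself, healthy below that.
function classify(value: number, threshold: number): ProbeStatus {
  if (value > threshold) return "critical";
  if (value >= threshold * 0.8) return "warning";
  return "healthy";
}

// With the default memory threshold of 80%, the warning band starts at 64%.
console.log(classify(12, 80)); // healthy
console.log(classify(70, 80)); // warning
console.log(classify(85, 80)); // critical
```

Raising a threshold through alertThresholds therefore shifts the warning band as well, since the band is derived from the threshold rather than configured separately.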
Container-Aware Metrics
On Kubernetes/Docker, the library automatically detects the container environment and reads real resource limits:
- Memory: Reads from /sys/fs/cgroup/memory.max (v2) or /sys/fs/cgroup/memory/memory.limit_in_bytes (v1) instead of os.totalmem()
- CPU: Normalizes usage against cgroup CPU quota (cpu.max / cpu.cfs_quota_us) instead of host CPU count
- Heap: Uses v8.getHeapStatistics().heap_size_limit for the real V8 heap ceiling
On bare metal/EC2 without containers, it falls back to standard OS metrics.
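To illustrate the detection and fallback described above, here is a rough sketch that uses only Node's fs and os modules. It is not the library's ContainerDetector; the function names parseCpuMax, readCgroupFile, and memoryLimitBytes are hypothetical, and the handling of unlimited values is an assumption.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";

// Parse cgroup v2 `cpu.max` content: "<quota> <period>" or "max <period>".
// Returns the effective CPU count, or null when no quota is set.
function parseCpuMax(content: string): number | null {
  const [quota, period] = content.trim().split(/\s+/);
  if (!quota || quota === "max") return null;
  const q = Number(quota);
  const p = Number(period ?? "100000");
  const valid = Number.isFinite(q) && Number.isFinite(p) && q > 0 && p > 0;
  return valid ? q / p : null;
}

// Read a cgroup file, returning null when it is missing or unreadable.
function readCgroupFile(path: string): string | null {
  try {
    return fs.readFileSync(path, "utf8").trim();
  } catch {
    return null;
  }
}

// Memory limit in bytes: cgroup v2 first, then v1, then the host total.
function memoryLimitBytes(): number {
  const hostTotal = os.totalmem();
  const candidates = [
    "/sys/fs/cgroup/memory.max",
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
  ];
  for (const path of candidates) {
    const raw = readCgroupFile(path);
    if (raw === null || raw === "max") continue;
    const limit = Number(raw);
    // cgroup v1 reports a very large number when unlimited, so any value
    // above host RAM is treated as "no limit" (an assumption of this sketch).
    if (Number.isFinite(limit) && limit > 0 && limit <= hostTotal) return limit;
  }
  return hostTotal;
}

console.log(parseCpuMax("50000 100000")); // 0.5
console.log(memoryLimitBytes() > 0);
```

A quota of 50000 over a period of 100000 corresponds to the "effectiveCpus": 0.5 value shown in the sample /health response.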
Configuration Options
interface MonitoringConfig {
  enableHealthRoute?: boolean; // Enable /health endpoint (default: true)
  healthRoutePath?: string; // Health route path (default: '/health')
  enableMetricsCollection?: boolean; // Enable metrics collection (default: true)
  metricsInterval?: number; // Collection interval in ms (default: 30000)
  enableRequestTracking?: boolean; // Track HTTP requests (default: true)
  enableErrorTracking?: boolean; // Track 5xx errors (default: true)
  customProbes?: CustomProbe[]; // Additional custom probes
  alertThresholds?: AlertThresholds; // Custom alert thresholds
  livenessPath?: string; // K8s liveness path (default: '/healthz')
  readinessPath?: string; // K8s readiness path (default: '/readyz')
  probeTimeout?: number; // Probe execution timeout in ms (default: 10000)
}
interface AlertThresholds {
  memoryUsage?: number; // Memory usage threshold percentage
  cpuUsage?: number; // CPU usage threshold percentage
  responseTime?: number; // Response time threshold in ms
  errorRate?: number; // Error rate threshold per minute
}
Graceful Shutdown
The service automatically handles SIGTERM signals (sent by Kubernetes during pod termination):
- Sets the isShuttingDown flag, so liveness and readiness probes immediately return 503
- Clears all metric collection and probe execution intervals
- Cleans up system metrics and request tracker resources
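The first two steps can be pictured with a small stand-in class. This is a sketch of the general pattern only, assuming the service keeps its timers in a list; ShutdownAwareTimers and its members are hypothetical names, not part of the package.

```typescript
// Hypothetical stand-in for the service's internal timer bookkeeping.
class ShutdownAwareTimers {
  private timers: ReturnType<typeof setInterval>[] = [];
  isShuttingDown = false;

  // Register a recurring task; refused once shutdown has started.
  every(ms: number, task: () => void): void {
    if (this.isShuttingDown) return;
    this.timers.push(setInterval(task, ms));
  }

  get activeTimers(): number {
    return this.timers.length;
  }

  // Flip the flag first so probes report 503 right away,
  // then stop every interval so the process can exit cleanly.
  shutdown(): void {
    this.isShuttingDown = true;
    for (const timer of this.timers) clearInterval(timer);
    this.timers = [];
  }
}

const timers = new ShutdownAwareTimers();
timers.every(30_000, () => console.log("collect metrics"));
timers.every(60_000, () => console.log("run probes"));
timers.shutdown();
console.log(timers.isShuttingDown, timers.activeTimers);
```

Clearing the intervals matters because a live setInterval keeps the Node.js event loop running, which would otherwise hold the pod open until Kubernetes sends SIGKILL.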
You can also trigger shutdown manually:
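import { MonitoringService } from "@techalmondsai/nodejs-monitoring";
const monitoring = MonitoringService.getInstance();
// `server` is the http.Server returned by app.listen()
process.on("SIGTERM", async () => {
  await monitoring.shutdown();
  server.close();
});
Custom Probes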
import { CustomProbe, ProbeResult } from "@techalmondsai/nodejs-monitoring";
const customProbe: CustomProbe = {
  name: "external_api_check",
  check: async (): Promise<ProbeResult> => {
    const start = Date.now();
    try {
      const response = await fetch("https://api.example.com/health");
      return {
        status: response.ok ? "healthy" : "critical",
        message: `External API responded with ${response.status}`,
        value: response.status,
        metadata: {
          responseTime: Date.now() - start,
          url: "https://api.example.com/health",
        },
      };
    } catch (error) {
      // `error` is `unknown` under strict TypeScript, so narrow it before reading .message
      const reason = error instanceof Error ? error.message : String(error);
      return {
        status: "critical",
        message: `External API unreachable: ${reason}`,
      };
    }
  },
  interval: 60000,
};
// `monitoring` is the instance returned by setupMonitoring() or MonitoringService.getInstance()
monitoring.addProbe(customProbe);
Probes that hang beyond the configured probeTimeout (default 10s) are automatically marked as critical.
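A common way to enforce this kind of timeout is to race the probe against a timer. The sketch below shows that general technique and is not taken from the library; ProbeOutcome and runWithTimeout are hypothetical names, and ProbeOutcome is a simplified local type rather than the exported ProbeResult.

```typescript
type ProbeOutcome = {
  status: "healthy" | "warning" | "critical";
  message: string;
};

// Race the probe against a timer. A timeout or a thrown error both
// produce a critical outcome instead of rejecting.
function runWithTimeout(
  check: () => Promise<ProbeOutcome>,
  timeoutMs: number,
): Promise<ProbeOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<ProbeOutcome>((resolve) => {
    timer = setTimeout(
      () =>
        resolve({
          status: "critical",
          message: `Probe timed out after ${timeoutMs}ms`,
        }),
      timeoutMs,
    );
  });
  // Promise.resolve().then(check) also captures synchronous throws.
  const guarded = Promise.resolve()
    .then(check)
    .catch(
      (err: unknown): ProbeOutcome => ({
        status: "critical",
        message: err instanceof Error ? err.message : String(err),
      }),
    );
  // Clear the timer once either side settles so it cannot keep the process alive.
  return Promise.race([guarded, timeout]).finally(() => clearTimeout(timer));
}

// A probe that never settles is reported as critical after 50ms.
runWithTimeout(() => new Promise<ProbeOutcome>(() => {}), 50).then((result) =>
  console.log(result.status, result.message),
);
```

Note that racing does not cancel the underlying work: a hung network call keeps running in the background, so probes that perform I/O may still want their own cancellation, for example an AbortSignal passed to fetch.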
Integration Examples
Express.js with TypeScript
import express from "express";
import { Pool } from "pg";
import { setupMonitoring } from "@techalmondsai/nodejs-monitoring";
const app = express();
// Assumes PostgreSQL via the pg package; new Pool() reads the standard PG* environment variables
const pool = new Pool();
const monitoring = setupMonitoring(app, {
  healthRoutePath: "/api/health",
  alertThresholds: {
    memoryUsage: 85,
    cpuUsage: 90,
  },
});
monitoring.addProbe({
  name: "postgres_connection",
  check: async () => {
    try {
      const client = await pool.connect();
      try {
        await client.query("SELECT NOW()");
      } finally {
        // Always return the client to the pool, even if the query fails
        client.release();
      }
      return { status: "healthy", message: "PostgreSQL connected" };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      return {
        status: "critical",
        message: `PostgreSQL error: ${reason}`,
      };
    }
  },
});
export default app;
NestJS Integration
import { NestFactory } from "@nestjs/core";
import { AppModule } from "./app.module";
import { setupMonitoring } from "@techalmondsai/nodejs-monitoring";
async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  const monitoring = setupMonitoring(app.getHttpAdapter().getInstance(), {
    healthRoutePath: "/health",
    metricsInterval: 20000,
  });
  await app.listen(3000);
}
bootstrap();
Manual Usage (Without Express)
import { MonitoringService } from "@techalmondsai/nodejs-monitoring";
const monitoring = MonitoringService.getInstance({
  enableHealthRoute: false,
  metricsInterval: 15000,
});
// Get current metrics
const metrics = monitoring.getCurrentMetrics();
// Get probe results
const probes = monitoring.getProbeResults();
// Add custom probe
monitoring.addProbe({
  name: "custom_check",
  check: async () => {
    return { status: "healthy", message: "All good!" };
  },
});
API Reference
MonitoringService
| Method | Description |
|--------|-------------|
| getInstance(config?) | Get or create the singleton instance |
| addProbe(probe) | Add a custom health probe |
| getCurrentMetrics() | Get current system metrics |
| getProbeResults() | Get all probe results |
| requestTrackingMiddleware() | Express middleware for request tracking |
| healthCheckHandler() | Express handler for /health |
| livenessHandler() | Express handler for /healthz |
| readinessHandler() | Express handler for /readyz |
| metricsHistoryHandler() | Express handler for /health/metrics |
| shutdown() | Graceful shutdown (clears intervals, sets shutting down flag) |
| static reset() | Reset all singletons (for testing) |
Helper Functions
| Function | Description |
|----------|-------------|
| setupMonitoring(app, config?) | Quick setup for Express apps — registers all endpoints and middleware |
Exports
| Export | Description |
|--------|-------------|
| MonitoringService | Core monitoring singleton |
| BuiltInProbes | Factory methods for built-in probes |
| ContainerDetector | Container/cgroup detection utility |
| HealthMetrics | TypeScript interface for metrics |
| MonitoringConfig | TypeScript interface for config |
| CustomProbe | TypeScript interface for custom probes |
| ProbeResult | TypeScript interface for probe results |
| AlertThresholds | TypeScript interface for thresholds |
Testing
npm test
npm run test:watch
npm test -- --coverage
License
MIT License - see LICENSE file for details.
Support
- GitHub Issues: Report bugs and request features
- Documentation: Full API documentation
