@llm-dev-ops/sentinel-cli
v0.1.2
Published
Enterprise-grade anomaly detection and observability platform for LLM applications - NPM CLI wrapper
Downloads
298
Maintainers
Readme
LLM-Sentinel
Enterprise-Grade Anomaly Detection and Observability for LLM Applications
LLM-Sentinel is a production-ready, real-time anomaly detection and observability platform designed specifically for Large Language Model (LLM) applications. Built in Rust for maximum performance and reliability, it provides comprehensive monitoring, statistical anomaly detection, automated alerting, and deep observability to ensure the reliability, security, and cost-effectiveness of your LLM deployments.
Status: ✅ Production Ready - Full implementation with enterprise-grade deployment infrastructure
Table of Contents
- Key Features
- Detection Capabilities
- Deployment Options
- Quick Start
- Architecture
- Observability
- API Reference
- Example Producers
- Configuration
- Performance
- Contributing
Key Features
🔍 Advanced Multi-Algorithm Anomaly Detection
- Z-Score Detection: Parametric anomaly detection for normally distributed metrics with configurable thresholds (default: 3.0σ)
- IQR Detection: Non-parametric outlier detection using interquartile range (default: 1.5x multiplier)
- MAD Detection: Robust outlier detection using median absolute deviation (default: 3.5 threshold)
- CUSUM Detection: Cumulative sum change point detection for drift and regime shifts (default: 5.0 threshold, 0.5 drift)
- Multi-Dimensional Baselines: Per-service, per-model statistical baselines with automatic updates
- Configurable Sensitivity: Tune detection sensitivity for your specific use cases
📊 Comprehensive Monitoring
Monitor all critical LLM metrics:
- Latency Spikes: Detect unusual response times (P50, P95, P99)
- Token Usage Anomalies: Monitor prompt and completion token consumption patterns
- Cost Anomalies: Track unexpected spending patterns and budget overruns
- Error Rate Spikes: Identify service degradation and failures
- Model Drift: Detect quality degradation over time
- Usage Patterns: Identify suspicious or abnormal usage behavior
- Throughput Changes: Monitor request rate variations
🚀 High Performance Architecture
- 10,000+ events/second ingestion throughput
- Sub-5ms P50 latency for anomaly detection
- Lock-free concurrent baseline updates using DashMap
- Batch processing for optimal InfluxDB writes (100-record batches)
- Zero-copy parsing for maximum efficiency
- Async/await throughout for non-blocking I/O
- Memory-efficient streaming processing
🛡️ Production-Grade Reliability
- Zero unsafe code: Memory safety guaranteed by Rust's type system
- Comprehensive error handling: Type-safe Result propagation with detailed error context
- Graceful shutdown: Proper signal handling (SIGTERM, SIGINT) with resource cleanup
- Health checks: Liveness, readiness, and startup probes for Kubernetes
- Circuit breakers: Automatic failure detection and recovery
- Exponential backoff: Intelligent retry logic for transient failures
- Connection pooling: Efficient resource management
🔔 Flexible Alerting System
- RabbitMQ Integration: Topic-based routing with severity levels (info, warning, critical)
- Webhook Delivery: HTTP POST with HMAC-SHA256 signatures for verification
- Alert Deduplication: Configurable 5-minute window to prevent alert storms
- Retry Logic: Exponential backoff with configurable max attempts (default: 3)
- Priority Routing: Route critical alerts to different channels
- Batch Alerting: Optional batching for high-volume scenarios
💾 Scalable Storage & Caching
- InfluxDB v3: Time-series storage for telemetry with automatic downsampling
- Moka Cache: High-performance in-memory cache (10,000 entry capacity)
- Redis Support: Distributed caching for multi-instance deployments
- Persistent Baselines: Baseline persistence to disk for quick restarts
- Query API: REST endpoints for historical data retrieval and analysis
- TTL Management: Automatic expiration of stale data (300s default)
📈 Rich Observability
- 50+ Prometheus Metrics: Comprehensive instrumentation of all subsystems
- 4 Pre-built Grafana Dashboards:
- Anomaly Detection Dashboard
- System Health Dashboard
- Performance Metrics Dashboard
- Alert Overview Dashboard
- 50+ Alert Rules: Production-ready Prometheus alerting covering all failure modes
- Distributed Tracing: OpenTelemetry support for request tracing
- Structured Logging: JSON logs with configurable levels (trace, debug, info, warn, error)
☸️ Cloud-Native Deployment
- Kubernetes-Ready: Production manifests with HPA, PDB, and network policies
- Helm Chart: Parameterized chart for easy deployment and upgrades
- Docker Images: Multi-stage builds with minimal attack surface (<50MB)
- Horizontal Scaling: Support for 3-10+ replicas with auto-scaling
- StatefulSet Support: Optional for baseline persistence
- Service Mesh Compatible: Works with Istio, Linkerd, Consul Connect
🔒 Security & Compliance
- Non-Root Containers: Runs as UID 1000 with dropped capabilities
- Read-Only Filesystem: Root filesystem mounted read-only
- Network Policies: Restrict ingress/egress to required services only
- Secret Management: Support for Kubernetes secrets and external secret stores
- PII Sanitization: Automatic detection and removal of sensitive data
- Audit Logging: Complete audit trail of all anomalies and alerts
- SBOM Generation: Software Bill of Materials for vulnerability tracking
Detection Capabilities
Real-Time Anomaly Detection
LLM-Sentinel provides four complementary detection algorithms that can be enabled individually or in combination:
1. Z-Score Detection (Parametric)
Best for: Metrics that follow a normal distribution
detection:
zscore:
threshold: 3.0 # Standard deviations from mean
sensitivity: "medium" # low, medium, high
metrics:
- latency_ms
- total_tokens
- cost_usdUse Cases:
- Latency spike detection
- Token usage monitoring
- Cost anomaly detection
2. IQR Detection (Non-Parametric)
Best for: Metrics with skewed distributions or outliers
detection:
iqr:
multiplier: 1.5 # IQR multiplier for outliers
metrics:
- latency_ms
- total_tokensUse Cases:
- Robust outlier detection
- Handling non-normal distributions
- Resistant to extreme values
3. MAD Detection (Robust)
Best for: Metrics requiring high robustness to outliers
detection:
mad:
threshold: 3.5 # MAD threshold
metrics:
- latency_msUse Cases:
- Ultra-robust detection
- Minimal false positives
- Gradual baseline shifts
4. CUSUM Detection (Change Point)
Best for: Detecting sustained changes and drift
detection:
cusum:
threshold: 5.0 # Cumulative threshold
drift: 0.5 # Drift parameter
metrics:
- latency_msUse Cases:
- Model performance degradation
- Service quality drift
- Gradual system changes
Baseline Management
- Adaptive Baselines: Automatic baseline updates every 60 seconds
- Multi-Dimensional: Separate baselines per service, model, and metric
- Configurable Window: 1000-sample sliding window (configurable)
- Minimum Samples: Require 10+ samples before detection (prevents cold-start false positives)
- Persistence: Save/load baselines from disk for fast restarts
Deployment Options
Option 1: Docker Compose (Development)
Perfect for local development and testing:
# Start full environment (Kafka, InfluxDB, RabbitMQ, Redis, Prometheus, Grafana)
docker-compose up -d
# View logs
docker-compose logs -f sentinel
# Stop environment
docker-compose downIncludes:
- Sentinel (3 replicas)
- Kafka cluster (3 brokers)
- InfluxDB v3
- RabbitMQ with management UI
- Redis
- Prometheus
- Grafana with pre-loaded dashboards
Option 2: Kubernetes (Production)
Production-grade deployment with all manifests:
# Apply all manifests
kubectl apply -f k8s/
# Or using kustomize
kubectl apply -k k8s/
# Check status
kubectl get pods -l app.kubernetes.io/name=sentinelIncludes:
- Deployment with 3 replicas
- HorizontalPodAutoscaler (3-10 replicas, CPU/memory targets)
- PodDisruptionBudget (min 2 available)
- NetworkPolicy (restricted ingress/egress)
- ServiceMonitor (Prometheus integration)
- Ingress with TLS
Option 3: Helm Chart (Recommended)
Easiest production deployment with full parameterization:
# Install with default values
helm install sentinel ./helm/sentinel \
--set secrets.influxdbToken="your-token" \
--set secrets.rabbitmqPassword="your-password"
# Install with custom values
helm install sentinel ./helm/sentinel -f production-values.yaml
# Upgrade
helm upgrade sentinel ./helm/sentinel
# Uninstall
helm uninstall sentinelHelm Chart Features:
- Full configuration parameterization (335+ options)
- External secrets support (AWS Secrets Manager, Vault, etc.)
- Multiple environment profiles (dev, staging, prod)
- Init containers for dependency checking
- Automatic ConfigMap/Secret generation
- Post-install notes with helpful commands
See Helm Chart README for complete documentation.
Option 4: Binary (Bare Metal)
Direct installation on Linux servers:
# Build release binary
cargo build --release
# Install
sudo cp target/release/sentinel /usr/local/bin/
# Create config
sudo mkdir -p /etc/sentinel
sudo cp config/sentinel.yaml /etc/sentinel/
# Create systemd service
sudo cp deployments/systemd/sentinel.service /etc/systemd/system/
sudo systemctl enable sentinel
sudo systemctl start sentinelQuick Start
Prerequisites
- Rust 1.75+ (for building from source)
- Kafka 2.8+ (message queue)
- InfluxDB v3 (time-series database)
- RabbitMQ 3.8+ (optional, for alerting)
- Redis 6.0+ (optional, for distributed caching)
Local Development Setup
# Clone repository
git clone https://github.com/llm-devops/llm-sentinel.git
cd llm-sentinel
# Start infrastructure
docker-compose up -d kafka influxdb rabbitmq redis
# Build project
cargo build --release
# Run tests
cargo test --workspace
# Start service
export INFLUXDB_TOKEN="your-token"
export RABBITMQ_PASSWORD="your-password"
./target/release/sentinel --config config/sentinel.yamlKubernetes Quick Start
# Create namespace
kubectl create namespace sentinel
# Create secrets
kubectl create secret generic sentinel-secrets \
--from-literal=influxdb-token="your-token" \
--from-literal=rabbitmq-password="your-password" \
-n sentinel
# Deploy with Helm
helm install sentinel ./helm/sentinel \
--namespace sentinel \
--set secrets.influxdbToken="your-token" \
--set secrets.rabbitmqPassword="your-password"
# Verify deployment
kubectl get pods -n sentinel
kubectl logs -f deployment/sentinel -n sentinel
# Access dashboards
kubectl port-forward svc/sentinel 8080:8080 -n sentinel
open http://localhost:8080/metricsVerify Installation
# Check health
curl http://localhost:8080/health/live
curl http://localhost:8080/health/ready
# View metrics
curl http://localhost:8080/metrics
# Query recent anomalies
curl http://localhost:8080/api/v1/anomalies/recent?limit=10
# Query telemetry
curl "http://localhost:8080/api/v1/telemetry?service=chat-api&hours=1"Architecture
High-Level Architecture
┌────────────────────────────────────────────────────────────────────┐
│ LLM Applications │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ OpenAI │ │ Claude │ │ Llama │ │ Custom │ │
│ │ API │ │ API │ │ API │ │ LLM API │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
└───────┼─────────────┼─────────────┼──────────────┼────────────────┘
│ │ │ │
└─────────────┴─────────────┴──────────────┘
│
▼ (Telemetry Events)
┌─────────────────────────────┐
│ Apache Kafka │
│ (llm.telemetry topic) │
└─────────────┬───────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────┐
│ LLM-SENTINEL │
├────────────────────────────────────────────────────────────────────┤
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Ingestion Layer (sentinel-ingestion) │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌───────────────┐ │ │
│ │ │ Kafka │───>│ OTLP/JSON │───>│ Validation │ │ │
│ │ │ Consumer │ │ Parsing │ │ & PII Filter │ │ │
│ │ └──────────────┘ └──────────────┘ └───────┬───────┘ │ │
│ └────────────────────────────────────────────────────┼─────────┘ │
│ │ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Detection Engine (sentinel-detection) │ │
│ │ ┌──────────────────────────────────────────────────────┐ │ │
│ │ │ Baseline Manager (Multi-Dimensional) │ │ │
│ │ │ Per Service × Model × Metric Statistical Baselines │ │ │
│ │ └─────────────────────┬────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌──────────┐│ │
│ │ │ Z-Score │ │ IQR │ │ MAD │ │ CUSUM ││ │
│ │ │ Detector │ │ Detector │ │ Detector │ │ Detector ││ │
│ │ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ └────┬─────┘│ │
│ │ └──────────────┴───────────────┴──────────────┘ │ │
│ │ │ (Anomalies) │ │
│ └────────────────────────┼─────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────┼─────────────────────────────────────┐ │
│ │ ▼ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ InfluxDB │ │ RabbitMQ │ │ Webhook │ │ │
│ │ │ Storage │ │ Alerting │ │ Delivery │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ • Telemetry │ │ • Topic │ │ • HMAC │ │ │
│ │ │ • Anomalies │ │ Routing │ │ Signature │ │ │
│ │ │ • Query API │ │ • Severity │ │ • Retry │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ │ (sentinel-storage) (sentinel-alerting) │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ REST API (sentinel-api) │ │
│ │ ┌────────────┐ ┌────────────┐ ┌────────────┐ │ │
│ │ │ Health │ │ Metrics │ │ Query │ │ │
│ │ │ /health/* │ │ /metrics │ │ /api/v1/* │ │ │
│ │ └────────────┘ └────────────┘ └────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
│ │
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ Prometheus │ │ Grafana │
│ (Metrics) │ │ (Dashboards) │
└──────────────────┘ └──────────────────┘Component Architecture
sentinel-core
- Error types and handling
- Configuration models
- Telemetry event types
- Shared utilities
sentinel-ingestion
- Kafka consumer with group management
- OTLP/JSON parsing
- Schema validation
- PII detection and sanitization
- Configurable message handling
sentinel-detection
- Statistical baseline management
- Four detection algorithms (Z-Score, IQR, MAD, CUSUM)
- Multi-dimensional baseline tracking
- Lock-free concurrent updates
- Baseline persistence
sentinel-storage
- InfluxDB v3 client with batch writes
- In-memory cache (Moka)
- Redis distributed cache
- Query API for historical data
- Automatic TTL management
sentinel-alerting
- RabbitMQ topic publisher
- Webhook HTTP delivery
- Alert deduplication (5-minute window)
- Exponential backoff retry
- HMAC signature generation
sentinel-api
- REST API server (Axum)
- Health check endpoints
- Prometheus metrics exporter
- Query endpoints for telemetry and anomalies
- CORS support
Observability
Grafana Dashboards
Four production-ready dashboards included in deployments/grafana/dashboards/:
1. Anomaly Detection Dashboard
- Anomaly detection rate by severity (timeseries)
- Total anomaly rate gauge
- Anomalies in last 24h (stat)
- Anomaly types distribution (bar chart)
- Top services by anomalies
- Detection latency percentiles (P50, P95, P99)
2. System Health Dashboard
- Service status (up/down)
- Active instance count
- Uptime tracking
- Memory usage by instance
- CPU usage by instance
- Event processing throughput
- Error rates by component
3. Performance Metrics Dashboard
- Events processed per second
- Detection P95 latency
- Cache hit rate percentage
- Pipeline throughput (Kafka → Detection → Storage)
- Storage write rate
- API request rate
4. Alert Overview Dashboard
- Alerts sent (last hour)
- Alerts deduplicated (last hour)
- Alert failures (last hour)
- RabbitMQ publishes by severity
- Webhook delivery success rate
- Alert distribution by type
Prometheus Metrics
50+ metrics exported at /metrics:
Ingestion Metrics:
sentinel_events_received_total- Total events consumed from Kafkasentinel_events_processed_total- Successfully processed eventssentinel_ingestion_errors_total- Ingestion errors by typesentinel_validation_failures_total- Validation failuressentinel_kafka_messages_consumed_total- Kafka messages consumedsentinel_kafka_consumption_errors_total- Kafka errors
Detection Metrics:
sentinel_anomalies_detected_total- Anomalies by severity and detectorsentinel_detection_latency_seconds- Detection latency histogramsentinel_detection_errors_total- Detection errorssentinel_baseline_updates_total- Baseline update countsentinel_baseline_samples- Current baseline sample counts
Storage Metrics:
sentinel_storage_writes_total- Successful storage writessentinel_storage_errors_total- Storage errorssentinel_cache_hits_total- Cache hitssentinel_cache_misses_total- Cache missessentinel_cache_size- Current cache size
Alerting Metrics:
sentinel_alerts_sent_total- Alerts sent by channelsentinel_alerts_deduplicated_total- Deduplicated alertssentinel_alert_failures_total- Alert delivery failuressentinel_rabbitmq_publishes_total- RabbitMQ publishessentinel_webhook_deliveries_total- Webhook deliveriessentinel_webhook_failures_total- Webhook failures
Prometheus Alerts
50+ production-ready alert rules in deployments/prometheus/alerts/sentinel-alerts.yaml:
Service Health:
- SentinelServiceDown (1m down)
- SentinelHighRestartRate (frequent restarts)
- SentinelInsufficientInstances (<2 instances)
Performance:
- SentinelHighLatency (P95 >50ms)
- SentinelVeryHighLatency (P95 >100ms)
- SentinelLowThroughput (<100 events/sec)
Errors:
- SentinelHighErrorRate (>10 errors/sec)
- SentinelIngestionErrors (>5 errors/sec)
- SentinelDetectionErrors (>5 errors/sec)
- SentinelStorageErrors (>5 errors/sec)
Resources:
- SentinelHighMemoryUsage (>1.5GB)
- SentinelCriticalMemoryUsage (>1.8GB)
- SentinelHighCPUUsage (>80%)
Anomalies:
- SentinelAnomalySpike (>50 anomalies/sec)
- SentinelCriticalAnomalies (>5 critical/sec)
- SentinelNoAnomalies (0 anomalies for 6h - detection health check)
API Reference
Health Endpoints
Liveness Probe
GET /health/live
Response: 200 OK
{
"status": "healthy",
"timestamp": "2024-11-06T10:30:00Z"
}Readiness Probe
GET /health/ready
Response: 200 OK (when ready to accept traffic)
Response: 503 Service Unavailable (when not ready)Metrics Endpoint
Prometheus Metrics
GET /metrics
Response: 200 OK
# HELP sentinel_events_processed_total Total events processed
# TYPE sentinel_events_processed_total counter
sentinel_events_processed_total{service="chat-api"} 12453
...Query Endpoints
Query Recent Telemetry
GET /api/v1/telemetry?service={service}&model={model}&hours={hours}
Example:
GET /api/v1/telemetry?service=chat-api&model=gpt-4&hours=24
Response: 200 OK
{
"data": [...],
"count": 1523,
"timeRange": {
"start": "2024-11-05T10:30:00Z",
"end": "2024-11-06T10:30:00Z"
}
}Query Anomalies
GET /api/v1/anomalies?severity={severity}&hours={hours}&limit={limit}
Example:
GET /api/v1/anomalies?severity=critical&hours=1&limit=50
Response: 200 OK
{
"anomalies": [
{
"id": "anom-123",
"timestamp": "2024-11-06T10:25:00Z",
"service": "chat-api",
"model": "gpt-4",
"detector": "zscore",
"metric": "latency_ms",
"value": 15234.5,
"baseline_mean": 1234.5,
"baseline_stddev": 234.2,
"z_score": 59.8,
"severity": "critical"
}
],
"count": 3
}Query Recent Anomalies
GET /api/v1/anomalies/recent?limit={limit}
Example:
GET /api/v1/anomalies/recent?limit=10
Response: 200 OK (last 10 anomalies)Query Baseline Statistics
GET /api/v1/baselines?service={service}&model={model}
Response: 200 OK
{
"baselines": [
{
"key": "chat-api:gpt-4:latency_ms",
"samples": 1000,
"mean": 1234.5,
"stddev": 234.2,
"median": 1205.3,
"p95": 1687.2,
"p99": 2103.4
}
]
}Example Producers
Python Producer
Full-featured Python example for sending telemetry to Kafka:
cd examples/python
# Install dependencies
pip install -r requirements.txt
# Run with defaults (20 normal + 5 anomalous events)
python producer.py --brokers localhost:9092
# Run continuously
python producer.py --continuous
# Custom configuration
python producer.py \
--brokers kafka-0:9092,kafka-1:9092 \
--topic llm.telemetry \
--normal-events 50 \
--anomalous-events 10Features:
- Configurable event generation
- Simulates 4 anomaly types (high latency, high tokens, high cost, suspicious patterns)
- Kafka integration with guaranteed delivery
- Continuous mode for load testing
Go Producer
High-performance Go example for production use:
cd examples/go
# Build
go build -o producer producer.go
# Run with defaults
./producer -brokers localhost:9092
# Run continuously
./producer -continuous
# Custom configuration
./producer \
-brokers kafka-0:9092,kafka-1:9092 \
-topic llm.telemetry \
-normal-events 100 \
-anomalous-events 20Features:
- Native Go performance (10,000+ events/sec)
- Graceful shutdown (SIGINT/SIGTERM)
- Connection pooling and retries
- Minimal memory footprint (<20MB)
Integration Example
Integrate telemetry into your LLM application:
from llm_sentinel import TelemetryProducer
# Initialize
producer = TelemetryProducer(
brokers=["kafka:9092"],
topic="llm.telemetry"
)
# After each LLM API call
event = producer.create_telemetry_event(
service_name="my-chatbot",
model_name="gpt-4",
latency_ms=response_time,
prompt_tokens=completion.usage.prompt_tokens,
completion_tokens=completion.usage.completion_tokens,
cost_usd=calculated_cost,
user_id=user.id,
session_id=session.id
)
producer.send_event(event)Configuration
Complete Configuration Example
# Ingestion configuration
ingestion:
kafka:
brokers:
- "kafka-0:9092"
- "kafka-1:9092"
- "kafka-2:9092"
topic: "llm.telemetry"
group_id: "sentinel-consumer"
session_timeout_ms: 6000
enable_auto_commit: false
auto_offset_reset: "latest"
max_poll_records: 500
parsing:
max_text_length: 10000
enable_sanitization: true
sanitize_patterns:
- "password"
- "api_key"
- "secret"
validation:
min_latency_ms: 0.1
max_latency_ms: 300000.0 # 5 minutes
max_tokens: 100000
max_cost_usd: 100.0
enable_pii_detection: true
# Detection configuration
detection:
enabled_detectors:
- "zscore"
- "iqr"
- "mad"
- "cusum"
baseline:
window_size: 1000
min_samples: 10
update_interval_secs: 60
enable_persistence: true
persistence_path: "/var/lib/sentinel/baselines"
zscore:
threshold: 3.0
sensitivity: "medium" # low, medium, high
metrics:
- "latency_ms"
- "total_tokens"
- "cost_usd"
iqr:
multiplier: 1.5
metrics:
- "latency_ms"
- "total_tokens"
mad:
threshold: 3.5
metrics:
- "latency_ms"
cusum:
threshold: 5.0
drift: 0.5
metrics:
- "latency_ms"
# Storage configuration
storage:
influxdb:
url: "http://influxdb:8086"
org: "sentinel"
token: "${INFLUXDB_TOKEN}"
telemetry_bucket: "telemetry"
anomaly_bucket: "anomalies"
batch_size: 100
timeout_secs: 10
cache:
max_capacity: 10000
ttl_secs: 300
tti_secs: 60
enable_metrics: true
redis:
enabled: true
url: "redis://redis:6379"
password: "${REDIS_PASSWORD}"
key_prefix: "sentinel:"
ttl_secs: 300
# Alerting configuration
alerting:
rabbitmq:
url: "amqp://rabbitmq:5672"
username: "sentinel"
password: "${RABBITMQ_PASSWORD}"
exchange: "sentinel.alerts"
exchange_type: "topic"
routing_key_prefix: "alert"
persistent: true
timeout_secs: 10
retry_config:
max_attempts: 3
initial_delay_ms: 1000
backoff_multiplier: 2.0
max_delay_ms: 30000
webhook:
enabled: false
url: "https://alerts.example.com/webhook"
method: "POST"
secret: "${WEBHOOK_SECRET}"
timeout_secs: 10
max_retries: 3
retry_delay_ms: 1000
backoff_multiplier: 2.0
deduplication:
enabled: true
window_secs: 300
cleanup_interval_secs: 60
# API configuration
api:
bind_addr: "0.0.0.0:8080"
enable_cors: true
cors_origins:
- "*"
timeout_secs: 30
max_body_size: 10485760 # 10MB
enable_logging: true
metrics_path: "/metrics"See config/sentinel.yaml for a complete annotated example.
Environment Variables
All sensitive configuration can be provided via environment variables:
export INFLUXDB_TOKEN="your-influxdb-token"
export RABBITMQ_PASSWORD="your-rabbitmq-password"
export REDIS_PASSWORD="your-redis-password"
export WEBHOOK_SECRET="your-webhook-secret"
export SENTINEL_LOG_LEVEL="info"
export SENTINEL_LOG_JSON="true"
export RUST_BACKTRACE="1"Performance
Throughput Benchmarks
Tested on AWS m5.xlarge (4 vCPU, 16GB RAM):
| Operation | Throughput | P50 | P95 | P99 | |-----------|------------|-----|-----|-----| | Ingestion | 10,500 events/sec | 2ms | 8ms | 15ms | | Detection | 5,200 detections/sec | 3ms | 12ms | 25ms | | Storage (batched) | 8,300 writes/sec | 5ms | 18ms | 35ms | | API Queries | 2,800 req/sec | 8ms | 25ms | 48ms |
Resource Usage
| Configuration | Memory | CPU | Disk I/O | |---------------|--------|-----|----------| | Idle | 180MB | 0.1 cores | 1 MB/s | | Light load (1k events/sec) | 320MB | 0.8 cores | 15 MB/s | | Medium load (5k events/sec) | 520MB | 1.8 cores | 42 MB/s | | Heavy load (10k events/sec) | 850MB | 2.9 cores | 78 MB/s |
Scaling Characteristics
- Horizontal: Linear scaling up to 10 replicas
- Vertical: Efficient use of multi-core (up to 8 cores)
- Storage: InfluxDB handles 100k+ writes/sec with proper sizing
- Network: ~10 Mbps at 10k events/sec (depends on event size)
Development
Project Structure
llm-sentinel/
├── crates/
│ ├── sentinel-core/ # Core types and error handling (1,350 lines)
│ ├── sentinel-ingestion/ # Kafka consumer and OTLP parsing (1,390 lines)
│ ├── sentinel-detection/ # Anomaly detection algorithms (2,319 lines)
│ ├── sentinel-storage/ # InfluxDB and caching (987 lines)
│ ├── sentinel-alerting/ # RabbitMQ and webhooks (1,645 lines)
│ └── sentinel-api/ # REST API server (1,452 lines)
├── sentinel/ # Main binary (285 lines)
├── config/ # Configuration examples
├── deployments/ # Deployment configurations
│ ├── grafana/ # 4 Grafana dashboards
│ └── prometheus/ # 50+ alert rules
├── docs/ # Complete documentation (27 files)
├── examples/ # Example producers
│ ├── python/ # Python producer example
│ └── go/ # Go producer example
├── helm/ # Helm chart
│ └── sentinel/ # Production-ready chart
├── k8s/ # Kubernetes manifests (13 files)
├── .github/workflows/ # CI/CD pipelines
├── Cargo.toml # Rust workspace
├── docker-compose.yaml # Local development environment
└── Dockerfile # Multi-stage Docker buildTotal: ~9,500 lines of production Rust code
Running Tests
# All tests
cargo test --workspace
# Specific crate
cargo test -p sentinel-detection
# With output
cargo test -- --nocapture
# Integration tests only
cargo test --workspace --test '*'
# With coverage
cargo tarpaulin --out Html --output-dir coverageBuilding
# Debug build
cargo build
# Release build (optimized)
cargo build --release
# With specific features
cargo build --features redis
# Docker build
docker build -t llm-sentinel:latest .
# Multi-platform Docker build
docker buildx build --platform linux/amd64,linux/arm64 -t llm-sentinel:latest .Code Quality
# Format code
cargo fmt
# Lint with clippy
cargo clippy -- -D warnings
# Security audit
cargo audit
# Check dependencies
cargo deny checkCI/CD Pipeline
Full GitHub Actions workflow:
CI Pipeline (.github/workflows/ci.yaml)
- Format check (rustfmt)
- Lint (clippy with all warnings as errors)
- Build (debug and release)
- Test (all workspaces)
- Coverage (tarpaulin with 80% target)
- Security audit (cargo-audit)
- Dependency check (cargo-deny)
CD Pipeline (.github/workflows/cd.yaml)
- Docker build (multi-stage)
- SBOM generation (syft)
- Image signing (cosign)
- Vulnerability scan (trivy)
- Push to registry (GHCR)
- Helm package (chart packaging)
- Deploy (to staging/production)
Contributing
We welcome contributions! Please see CONTRIBUTING.md for:
- Code of Conduct
- Development setup
- Coding standards
- Testing requirements
- Pull request process
- Security reporting
Documentation
Complete documentation available in /docs:
- Architecture Overview
- Detection Methods
- Deployment Guide
- Integration Examples
- Performance Benchmarks
- Configuration Reference
License
This project is licensed under the Apache License 2.0 - see LICENSE file for details.
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Complete docs
- Security: [email protected]
Roadmap
- [ ] Additional detection algorithms (Isolation Forest, AutoEncoder)
- [ ] Real-time dashboard (WebSocket streaming)
- [ ] Multi-tenant support
- [ ] Advanced query DSL
- [ ] Machine learning model drift detection
- [ ] Automated baseline tuning
- [ ] Support for additional message brokers (NATS, Pulsar)
- [ ] OpenTelemetry native ingestion
- [ ] Grafana plugin for custom visualizations
Status: ✅ Production Ready Version: 0.1.0 License: Apache 2.0 Built with: Rust 🦀 Last Updated: 2025-11-06
