@qcloudy/cdk-blackbox-monitoring
v0.7.5
AWS CDK construct library for CloudWatch Synthetics canaries with simple HTTP/HTTPS endpoint monitoring.
The construct creates CloudWatch Synthetics canaries, CloudWatch Alarms, IAM roles, and SNS-based notification routing for one or more HTTP/HTTPS endpoints, with validation and sensible defaults built in.
Installation
npm install @qcloudy/cdk-blackbox-monitoring
Peer dependencies:
- aws-cdk-lib (v2)
- constructs (v10)
Quick start
import * as cdk from 'aws-cdk-lib';
import { BlackboxMonitoring } from '@qcloudy/cdk-blackbox-monitoring';
const app = new cdk.App();
const stack = new cdk.Stack(app, 'MonitoringStack');
new BlackboxMonitoring(stack, 'Monitoring', {
  endpoints: [
    {
      name: 'api',
      url: 'https://api.example.com/health',
    },
  ],
});
Basic usage
Single endpoint with defaults
new BlackboxMonitoring(stack, 'Monitoring', {
  endpoints: [
    {
      name: 'api',
      url: 'https://api.example.com/health',
    },
  ],
});
Defaults:
- Schedule: every 5 minutes
- Timeout: 30 seconds
- HTTP method: GET
- Status validation: 200–399 is considered success
- Alarms:
  - One failed canary alarm per endpoint
  - Optional p95 duration alarm (disabled by default)
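The defaults listed above combine with the global `defaultSchedule`, `defaultTimeout`, and `defaultMethod` props. The following is a minimal sketch of how such a fallback chain could be resolved, assuming the order is endpoint value, then global default, then built-in default. The `CheckSettings` type, the plain-number durations, and `resolveSettings` are hypothetical names used for illustration and are not part of the library.

```typescript
// Hypothetical, simplified settings; the real props use cdk.Duration values.
type HttpMethod = 'GET' | 'HEAD' | 'POST' | 'PUT' | 'PATCH';

interface CheckSettings {
  scheduleMinutes?: number;
  timeoutSeconds?: number;
  method?: HttpMethod;
}

// Built-in defaults as listed above: every 5 minutes, 30 second timeout, GET.
const BUILT_IN_DEFAULTS: Required<CheckSettings> = {
  scheduleMinutes: 5,
  timeoutSeconds: 30,
  method: 'GET',
};

// Assumed fallback order: endpoint value, then global default, then built-in default.
function resolveSettings(
  endpoint: CheckSettings,
  globalDefaults: CheckSettings = {},
): Required<CheckSettings> {
  return {
    scheduleMinutes:
      endpoint.scheduleMinutes ?? globalDefaults.scheduleMinutes ?? BUILT_IN_DEFAULTS.scheduleMinutes,
    timeoutSeconds:
      endpoint.timeoutSeconds ?? globalDefaults.timeoutSeconds ?? BUILT_IN_DEFAULTS.timeoutSeconds,
    method: endpoint.method ?? globalDefaults.method ?? BUILT_IN_DEFAULTS.method,
  };
}
```

Under these assumptions, `resolveSettings({ timeoutSeconds: 10 }, { scheduleMinutes: 2 })` yields a 2 minute schedule, a 10 second timeout, and `GET`.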
Multiple endpoints and global defaults
import { Duration } from 'aws-cdk-lib';
new BlackboxMonitoring(stack, 'Monitoring', {
  defaultSchedule: Duration.minutes(2),
  defaultTimeout: Duration.seconds(20),
  defaultMethod: 'GET',
  endpoints: [
    {
      name: 'public-api',
      url: 'https://api.example.com/health',
    },
    {
      name: 'website',
      url: 'https://www.example.com',
      // Uses global defaults for schedule/timeout/method
    },
  ],
});
Advanced usage
Custom headers and HTTP methods
new BlackboxMonitoring(stack, 'Monitoring', {
  endpoints: [
    {
      name: 'head-check',
      url: 'https://api.example.com/health',
      method: 'HEAD',
      headers: {
        Authorization: 'Bearer token',
        'X-Tenant': 'example',
      },
    },
  ],
});
Status code validation
You can control which HTTP status codes are considered successful using allowedStatusCodes and allowedStatusRanges.
- Default: 200–399 is success.
- Explicit codes (allowedStatusCodes) take highest precedence.
- Ranges (allowedStatusRanges) are used when explicit codes are not provided.
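The precedence rules above can be sketched as a small pure function. This is an illustration of the documented behavior rather than the library's actual canary code: `isSuccessStatus` and `StatusRules` are hypothetical names, range bounds are assumed to be inclusive, and an empty array is assumed to count as "not provided".

```typescript
interface StatusRange {
  min: number;
  max: number;
}

// Hypothetical container for the two endpoint options discussed above.
interface StatusRules {
  allowedStatusCodes?: number[];
  allowedStatusRanges?: StatusRange[];
}

// Documented default: 200–399 counts as success.
const DEFAULT_RANGES: StatusRange[] = [{ min: 200, max: 399 }];

function isSuccessStatus(status: number, rules: StatusRules = {}): boolean {
  // 1. Explicit codes take highest precedence and ignore any ranges.
  if (rules.allowedStatusCodes && rules.allowedStatusCodes.length > 0) {
    return rules.allowedStatusCodes.some((code) => code === status);
  }
  // 2. Otherwise use the supplied ranges, or fall back to the default range.
  const ranges =
    rules.allowedStatusRanges && rules.allowedStatusRanges.length > 0
      ? rules.allowedStatusRanges
      : DEFAULT_RANGES;
  // Bounds are treated as inclusive (an assumption).
  return ranges.some((range) => status >= range.min && status <= range.max);
}
```

With ranges of 200–299 and 301–301, a 301 response passes and a 302 fails under this sketch; with `allowedStatusCodes: [200]`, only 200 passes regardless of any ranges supplied.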
import { StatusRange } from '@qcloudy/cdk-blackbox-monitoring';
const successRanges: StatusRange[] = [
  { min: 200, max: 299 },
  { min: 301, max: 301 },
];
new BlackboxMonitoring(stack, 'Monitoring', {
  endpoints: [
    {
      name: 'api',
      url: 'https://api.example.com/health',
      // Explicit list of allowed codes
      allowedStatusCodes: [200, 201, 204],
    },
    {
      name: 'redirect-api',
      url: 'https://api.example.com/health',
      // Accept ranges instead of specific codes
      allowedStatusRanges: successRanges,
    },
  ],
});
Duration alarms
Enable a p95 duration alarm per endpoint:
new BlackboxMonitoring(stack, 'Monitoring', {
  endpoints: [
    {
      name: 'latency-sensitive-endpoint',
      url: 'https://api.example.com/health',
      enableDurationAlarm: true,
      durationP95ThresholdMs: 1000, // 1 second
    },
  ],
});
Notification routing
Notification routing is based on SNS topics and CloudWatch alarm actions.
Precedence (highest to lowest):
- endpoint.alarmActions
- endpoint.notificationTopic
- props.alarmActions
- props.notificationTopic
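The precedence order above can be read as a first-match lookup. The sketch below assumes that the first configured level wins outright and that lower levels are not merged in; the source does not state this explicitly. `Topic`, `AlarmAction`, `topicAction`, and `resolveAlarmActions` are hypothetical stand-ins for `sns.ITopic`, `cloudwatch.IAlarmAction`, and the library's internal logic.

```typescript
// Hypothetical stand-ins for sns.ITopic and cloudwatch.IAlarmAction.
interface Topic {
  name: string;
}
interface AlarmAction {
  target: string;
}

// The two routing options that exist at both endpoint and construct level.
interface RoutingConfig {
  alarmActions?: AlarmAction[];
  notificationTopic?: Topic;
}

// Wraps a topic in an action, playing the role SnsAction plays in CDK.
function topicAction(topic: Topic): AlarmAction {
  return { target: topic.name };
}

// Walks the documented precedence list and returns the first configured level.
function resolveAlarmActions(endpoint: RoutingConfig, props: RoutingConfig): AlarmAction[] {
  if (endpoint.alarmActions && endpoint.alarmActions.length > 0) {
    return endpoint.alarmActions;
  }
  if (endpoint.notificationTopic) {
    return [topicAction(endpoint.notificationTopic)];
  }
  if (props.alarmActions && props.alarmActions.length > 0) {
    return props.alarmActions;
  }
  if (props.notificationTopic) {
    return [topicAction(props.notificationTopic)];
  }
  // Nothing configured: the alarm would have no actions.
  return [];
}
```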
import * as sns from 'aws-cdk-lib/aws-sns';
import * as cloudwatchActions from 'aws-cdk-lib/aws-cloudwatch-actions';
const globalTopic = new sns.Topic(stack, 'GlobalTopic');
const apiTopic = new sns.Topic(stack, 'ApiTopic');
new BlackboxMonitoring(stack, 'Monitoring', {
  notificationTopic: globalTopic,
  endpoints: [
    {
      name: 'default-routing',
      url: 'https://api.example.com/health',
    },
    {
      name: 'per-endpoint-topic',
      url: 'https://api.example.com/payments/health',
      notificationTopic: apiTopic,
    },
    {
      name: 'custom-actions',
      url: 'https://api.example.com/orders/health',
      alarmActions: [new cloudwatchActions.SnsAction(apiTopic)],
    },
  ],
});
Extension points
Override canary properties
Use canaryPropsOverride to merge extra synthetics.CanaryProps into the created canaries. Endpoint-level overrides win over global overrides.
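One way to picture "endpoint-level overrides win over global overrides" is a layered object spread, where later layers replace earlier ones key by key. The sketch below assumes a shallow merge; whether the library merges nested properties deeply is not stated in this README. `CanaryOverrides` is a hypothetical two-field subset of `synthetics.CanaryProps`, and `mergeOverrides` is an illustrative helper.

```typescript
// Hypothetical subset of synthetics.CanaryProps, for illustration only.
interface CanaryOverrides {
  startAfterCreation?: boolean;
  canaryName?: string;
}

// Shallow merge: construct defaults, then global overrides, then endpoint overrides.
function mergeOverrides(
  base: CanaryOverrides,
  globalOverride: CanaryOverrides = {},
  endpointOverride: CanaryOverrides = {},
): CanaryOverrides {
  return { ...base, ...globalOverride, ...endpointOverride };
}
```

Under this assumption, a global `startAfterCreation: true` combined with an endpoint `startAfterCreation: false` resolves to `false` for that endpoint, while keys set only at the global level still apply.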
new BlackboxMonitoring(stack, 'Monitoring', {
  canaryPropsOverride: {
    startAfterCreation: true,
  },
  endpoints: [
    {
      name: 'api',
      url: 'https://api.example.com/health',
      canaryPropsOverride: {
        startAfterCreation: false,
      },
    },
  ],
});
Override alarm properties
new BlackboxMonitoring(stack, 'Monitoring', {
  failedAlarmOverride: {
    alarmDescription: 'Global failed canary alarm',
  },
  durationAlarmOverride: {
    evaluationPeriods: 3,
  },
  endpoints: [
    {
      name: 'api',
      url: 'https://api.example.com/health',
      enableDurationAlarm: true,
      durationP95ThresholdMs: 1000,
      failedAlarmOverride: {
        evaluationPeriods: 5,
      },
    },
  ],
});
Custom canary code
If you need more complex behavior (multi-step flows, custom auth, etc.), you can provide your own canary code and handler:
import * as synthetics from 'aws-cdk-lib/aws-synthetics';
new BlackboxMonitoring(stack, 'Monitoring', {
  endpoints: [
    {
      name: 'custom-flow',
      url: 'https://api.example.com/health',
      customCanaryCode: synthetics.Code.fromAsset('path/to/custom-canary'),
      customHandler: 'index.handler',
    },
  ],
});
Resource outputs
All created resources are exposed via the endpoints map on the construct:
const monitoring = new BlackboxMonitoring(stack, 'Monitoring', {
  endpoints: [
    {
      name: 'api',
      url: 'https://api.example.com/health',
      enableDurationAlarm: true,
      durationP95ThresholdMs: 1000,
    },
  ],
});
const apiResources = monitoring.endpoints.get('api');
if (apiResources) {
  // Access the created resources
  const canary = apiResources.canary;
  const failedAlarm = apiResources.failedAlarm;
  const durationAlarm = apiResources.durationAlarm;
  const iamRole = apiResources.iamRole;
}
Slack integration (via SNS + AWS Chatbot)
This library does not depend directly on Slack or AWS Chatbot, but you can integrate alarms with Slack by:
- Creating an SNS topic for notifications.
- Configuring an AWS Chatbot Slack channel configuration that subscribes to that SNS topic.
- Passing the SNS topic to notificationTopic (global) or endpoint.notificationTopic.
Once configured, alarm notifications will appear in the configured Slack channel.
Simple Slack Integration (Lambda-based)
If you don't want to set up AWS Chatbot, you can provide a Slack Webhook URL directly to the construct. This will create a lightweight Node.js Lambda function that automatically formats and forwards alarm notifications to Slack.
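To give a sense of what "formats and forwards" could involve, here is a rough sketch of turning a CloudWatch alarm notification (the JSON string that CloudWatch publishes to SNS) into a Slack incoming-webhook payload. It is not the Lambda that the construct deploys; `formatSlackMessage`, the emoji choices, and the message layout are assumptions made for illustration.

```typescript
// Subset of the JSON that CloudWatch publishes to SNS on an alarm state change.
interface AlarmNotification {
  AlarmName: string;
  NewStateValue: 'OK' | 'ALARM' | 'INSUFFICIENT_DATA';
  NewStateReason: string;
  StateChangeTime: string;
}

// Minimal Slack incoming-webhook payload.
interface SlackPayload {
  text: string;
}

// Parses the SNS message body and builds a short, human-readable Slack message.
function formatSlackMessage(snsMessage: string): SlackPayload {
  const alarm = JSON.parse(snsMessage) as AlarmNotification;
  let icon = ':grey_question:';
  if (alarm.NewStateValue === 'ALARM') {
    icon = ':rotating_light:';
  } else if (alarm.NewStateValue === 'OK') {
    icon = ':white_check_mark:';
  }
  return {
    text:
      `${icon} *${alarm.AlarmName}* is ${alarm.NewStateValue}\n` +
      `${alarm.NewStateReason}\n` +
      `${alarm.StateChangeTime}`,
  };
}
```

A forwarding function would then POST `JSON.stringify(payload)` to the webhook URL with a JSON content type; that network call is left out here so the sketch stays side-effect free.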
new BlackboxMonitoring(stack, 'Monitoring', {
  slackWebhookUrl: 'https://hooks.slack.com/services/...',
  endpoints: [
    {
      name: 'api',
      url: 'https://api.example.com/health',
    },
  ],
});
AI Chatbot & Complex Service Monitoring
For services that need more than a simple status check, such as an AI chatbot or a Bedrock-powered API, the construct can send a request body and validate the response body (for example, a "ping-pong" health check) without custom canary code.
Example: Chatbot Ping-Pong Check
This example shows how to monitor a chatbot by sending a "ping" message and verifying the response contains "pong".
import { Duration } from 'aws-cdk-lib';
new BlackboxMonitoring(stack, 'ChatbotMonitoring', {
  endpoints: [
    {
      name: 'chatbot-health',
      url: 'https://api.example.com/v1/chat',
      schedule: Duration.minutes(60),
      method: 'POST',
      body: JSON.stringify({ message: 'ping' }),
      expectedBodyContent: 'pong',
      // Optionally notify via Slack if the "pong" is missing
      slackWebhookUrl: 'https://hooks.slack.com/services/...',
    },
  ],
});
API reference (high level)
BlackboxMonitoringProps
- endpoints: BlackboxEndpoint[] (required)
- defaultSchedule?: Duration
- defaultTimeout?: Duration
- defaultMethod?: 'GET' | 'HEAD' | 'POST' | 'PUT' | 'PATCH'
- alarmDefaults?:
  - failedThreshold?: number
  - failedEvaluationPeriods?: number
  - failedDatapointsToAlarm?: number
  - treatMissingData?: cloudwatch.TreatMissingData
- notificationTopic?: sns.ITopic
- slackWebhookUrl?: string
- alarmActions?: cloudwatch.IAlarmAction[]
- namePrefix?: string
- tags?: { [key: string]: string }
- canaryPropsOverride?: Partial<synthetics.CanaryProps>
- failedAlarmOverride?: Partial<cloudwatch.AlarmProps>
- durationAlarmOverride?: Partial<cloudwatch.AlarmProps>
BlackboxEndpoint
- name: string (unique per construct)
- url: string (http or https)
- schedule?: Duration
- timeout?: Duration
- method?: 'GET' | 'HEAD' | 'POST' | 'PUT' | 'PATCH'
- body?: string
- expectedBodyContent?: string
- headers?: { [key: string]: string }
- allowedStatusCodes?: number[]
- allowedStatusRanges?: StatusRange[]
- enableDurationAlarm?: boolean
- durationP95ThresholdMs?: number
- notificationTopic?: sns.ITopic
- alarmActions?: cloudwatch.IAlarmAction[]
- canaryPropsOverride?: Partial<synthetics.CanaryProps>
- failedAlarmOverride?: Partial<cloudwatch.AlarmProps>
- durationAlarmOverride?: Partial<cloudwatch.AlarmProps>
- customCanaryCode?: synthetics.Code
- customHandler?: string
StatusRange
- min: number
- max: number
EndpointResources
- canary: synthetics.Canary
- failedAlarm: cloudwatch.Alarm
- durationAlarm?: cloudwatch.Alarm
- notificationTopic?: sns.ITopic
- iamRole: iam.Role
Troubleshooting
Validation errors at synth time
- Ensure endpoint names are unique and non-empty.
- Check that URLs use http or https.
- Make sure defaultSchedule and schedule are at least 1 minute.
- Verify timeouts and durationP95ThresholdMs are positive.
- If you provide customCanaryCode, you must also provide customHandler.
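The checklist above maps onto a set of simple checks that can be run before synthesis. The following sketch mirrors those rules on a simplified input shape; it is not the construct's own validation code. `EndpointInput` (with plain-number durations and a `hasCustomCanaryCode` flag) and `validateEndpoints` are hypothetical, as are the error message formats.

```typescript
// Hypothetical, simplified endpoint shape; durations are plain numbers here.
interface EndpointInput {
  name: string;
  url: string;
  scheduleMinutes?: number;
  timeoutSeconds?: number;
  durationP95ThresholdMs?: number;
  hasCustomCanaryCode?: boolean;
  customHandler?: string;
}

// Returns one message per violated rule; an empty array means the input passed.
function validateEndpoints(endpoints: EndpointInput[]): string[] {
  const errors: string[] = [];
  const seen: { [name: string]: boolean } = {};

  endpoints.forEach((ep, index) => {
    const hasName = Boolean(ep.name && ep.name.trim());
    const label = hasName ? ep.name : `#${index}`;

    if (!hasName) {
      errors.push(`endpoint ${label}: name must be non-empty`);
    } else if (Object.prototype.hasOwnProperty.call(seen, ep.name)) {
      errors.push(`endpoint ${label}: name must be unique`);
    } else {
      seen[ep.name] = true;
    }
    if (!/^https?:\/\//i.test(ep.url)) {
      errors.push(`endpoint ${label}: url must use http or https`);
    }
    if (ep.scheduleMinutes !== undefined && ep.scheduleMinutes < 1) {
      errors.push(`endpoint ${label}: schedule must be at least 1 minute`);
    }
    if (ep.timeoutSeconds !== undefined && ep.timeoutSeconds <= 0) {
      errors.push(`endpoint ${label}: timeout must be positive`);
    }
    if (ep.durationP95ThresholdMs !== undefined && ep.durationP95ThresholdMs <= 0) {
      errors.push(`endpoint ${label}: durationP95ThresholdMs must be positive`);
    }
    if (ep.hasCustomCanaryCode && !ep.customHandler) {
      errors.push(`endpoint ${label}: customHandler is required with customCanaryCode`);
    }
  });

  return errors;
}
```

Running a check like this in a unit test can surface configuration mistakes with a readable message before `cdk synth` is invoked.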
No notifications received
- Ensure that at least one of notificationTopic, alarmActions, or endpoint-specific equivalents is configured.
- Verify SNS subscriptions (e.g. email confirmed, Slack/Chatbot configured).
Canary names/roles/alarms conflicting
- Use namePrefix to namespace resources per environment or application.
Development
# Install dependencies
npm install
# Build TypeScript
npm run build
# Run tests
npm test
# Bump version
npm version --no-git-tag-version x.x.x
License
MIT
