@renovosolutions/cdk-library-aurora-native-backup v0.0.1
cdk-library-aurora-native-backup
A CDK construct library that creates and manages Docker images for Aurora PostgreSQL native backups using pg_dump.
The resulting images are designed for use with Amazon ECS Fargate for scalable, serverless backup operations.
Features
- Multi-Database Support: Back up multiple databases from the same Aurora cluster in a single service
- Pre-built Docker Image: Amazon Linux 2023 base with PostgreSQL 17 client tools and AWS CLI v2
- ECR Repository Management: Automatically creates and manages ECR repositories with security best practices
- Complete Backup Service: Ready-to-use ECS Fargate service for scheduled Aurora backups
- EFS and S3 Support: Built-in support for backing up to EFS with S3 sync
- Comprehensive Backup: Uses pg_dump directory format for efficient storage and simplified restores
- Production Ready: Includes proper error handling, logging, and cleanup mechanisms
- Secure Authentication: Uses AWS Secrets Manager for database password management
API Doc
See API
Interface Structure
The library provides two main constructs, each with its own configuration interface:
- AuroraBackupRepository (AuroraBackupRepositoryProps): Manages the ECR repository and Docker image for backups.
- AuroraNativeBackupService (AuroraNativeBackupServiceProps): Manages the backup service infrastructure (VPC, Aurora cluster, S3 bucket, compute resources, etc.) and uses:
  - AuroraBackupConnectionProps: Database connection settings (username, database names array, password secret).
This separation allows for cleaner organization of image/repository management, connection credentials, and infrastructure settings.
Multi-Database Support
The library supports backing up multiple databases from the same Aurora PostgreSQL cluster in a single backup service. Simply provide an array of database names in the databaseNames property (defaults to ['postgres'] if not specified). Each database will be backed up separately and stored in its own S3 folder structure.
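The default described above can be sketched as a small helper (hypothetical; `resolveDatabaseNames` is not part of the library's API, it only illustrates the documented fallback to `['postgres']`):

```typescript
// Illustrates the documented default: when databaseNames is omitted or
// empty, the service backs up the default 'postgres' database.
function resolveDatabaseNames(databaseNames?: string[]): string[] {
  return databaseNames && databaseNames.length > 0 ? databaseNames : ['postgres'];
}

console.log(resolveDatabaseNames());                     // falls back to the default
console.log(resolveDatabaseNames(['app', 'analytics'])); // each backed up separately
```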
Database User Setup
Create a dedicated database user with read-only backup permissions on ALL databases to be backed up.
For PostgreSQL 14+ (recommended), use the built-in pg_read_all_data role for comprehensive read access:
-- Connect to each database and grant permissions
\c your_database_1;
GRANT CONNECT ON DATABASE your_database_1 TO backup_user;
GRANT pg_read_all_data TO backup_user;
-- Repeat for each additional database
\c your_database_2;
GRANT CONNECT ON DATABASE your_database_2 TO backup_user;
GRANT pg_read_all_data TO backup_user;
The pg_read_all_data role automatically provides:
- SELECT on all tables and views
- USAGE on all schemas
- SELECT and USAGE on all sequences
- Access to future objects without requiring additional grants
Note: This library requires PostgreSQL 14 or newer for the pg_read_all_data role.
Shortcomings
- The backup service requires password-based authentication; IAM database authentication is not currently supported
- The backup container runs as a scheduled task, not continuously, so it cannot capture incremental changes
- Custom backup scripts are not currently supported, only the built-in pg_dump functionality
- When backing up multiple databases, if one database backup fails, the task continues with the remaining databases and the overall task does not fail; individual database backup failures must be monitored through CloudWatch Logs
Examples
Prerequisites
To use this construct, you must have:
- An AWS CDK stack with a defined environment (account and region)
- An existing VPC for the backup service
- An existing Aurora PostgreSQL database cluster
- An AWS Secrets Manager secret containing database credentials (recommended)
- A database user with the required backup permissions (see above)
Complete Backup Service (Recommended)
For most use cases, use the AuroraNativeBackupService which provides a complete, ready-to-use backup solution:
TypeScript
import { Stack, StackProps, aws_ec2 as ec2, aws_rds as rds, aws_secretsmanager as secretsmanager } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { AuroraNativeBackupService, AuroraBackupRepository } from '@renovosolutions/cdk-library-aurora-native-backup';
export class BackupServiceStack extends Stack {
constructor(scope: Construct, id: string, props: StackProps) {
super(scope, id, props);
// Your existing Aurora PostgreSQL database cluster and VPC
const vpc = ec2.Vpc.fromLookup(this, 'Vpc', { isDefault: true });
const dbCluster = rds.DatabaseCluster.fromDatabaseClusterAttributes(this, 'DbCluster', {
clusterIdentifier: 'my-production-cluster',
clusterEndpointAddress: 'cluster.xyz.region.rds.amazonaws.com',
port: 5432,
});
// First create the backup repository
const backupRepository = new AuroraBackupRepository(this, 'BackupRepository', {
repositoryName: 'aurora-postgres-backup',
});
// Secret containing the backup user's password
const backupUserSecret = secretsmanager.Secret.fromSecretAttributes(this, 'BackupUserSecret', {
secretArn: 'arn:aws:secretsmanager:region:account:secret:backup-user-password-abc123',
});
// Create the complete backup service
const backupService = new AuroraNativeBackupService(this, 'BackupService', {
cluster: dbCluster,
vpc,
backupBucketName: 'my-aurora-production-backups',
ecrRepository: backupRepository.repository,
connection: {
username: 'backup_user',
databaseNames: ['production', 'analytics', 'reporting'],
passwordSecret: backupUserSecret,
},
retentionDays: 30,
backupSchedule: '0 2 * * *', // Daily at 2 AM UTC
cpu: 1024, // Override default of 256
memoryLimitMiB: 2048, // Override default of 512
});
}
}
Python
from aws_cdk import (
Stack,
aws_ec2 as ec2,
aws_rds as rds,
aws_secretsmanager as secretsmanager
)
from constructs import Construct
from cdk_library_aurora_native_backup import AuroraNativeBackupService, AuroraBackupRepository
class BackupServiceStack(Stack):
def __init__(self, scope: Construct, id: str, **kwargs):
super().__init__(scope, id, **kwargs)
# Your existing Aurora PostgreSQL database cluster and VPC
vpc = ec2.Vpc.from_lookup(self, "Vpc", is_default=True)
db_cluster = rds.DatabaseCluster.from_database_cluster_attributes(self, "DbCluster",
cluster_identifier="my-production-cluster",
cluster_endpoint_address="cluster.xyz.region.rds.amazonaws.com",
port=5432
)
# First create the backup repository
backup_repository = AuroraBackupRepository(self, "BackupRepository",
repository_name="aurora-postgres-backup"
)
# Secret containing the backup user's password
backup_user_secret = secretsmanager.Secret.from_secret_attributes(self, "BackupUserSecret",
secret_arn="arn:aws:secretsmanager:region:account:secret:backup-user-password-abc123"
)
# Create the complete backup service
backup_service = AuroraNativeBackupService(self, "BackupService",
cluster=db_cluster,
vpc=vpc,
backup_bucket_name="my-aurora-production-backups",
ecr_repository=backup_repository.repository,
connection={
"username": "backup_user",
"database_names": ["production", "analytics", "reporting"],
"password_secret": backup_user_secret
},
retention_days=30,
backup_schedule="0 2 * * *", # Daily at 2 AM UTC
cpu=1024, # Override default of 256
            memory_limit_mib=2048  # Override default of 512
        )
Environment Variables
All environment variables used by the backup container are set automatically by the constructs. You do not need to set them manually.
| Environment Variable | Description | CDK Prop / Source |
|-----------------------|-----------------------------------------------------------|------------------------------------------|
| DB_HOST | Aurora PostgreSQL database cluster endpoint | cluster.clusterEndpoint.hostname |
| DB_NAMES | Array of database names to back up | connection.databaseNames |
| DB_USER | Database username | connection.username |
| DB_PASSWORD | Database password | connection.passwordSecret |
| AWS_REGION | AWS region | Stack.region |
| CLUSTER_IDENTIFIER | Cluster ID used as S3 path prefix (backups/{CLUSTER_IDENTIFIER}/) | cluster.clusterIdentifier |
| DB_PORT | Database port (default: 5432) | cluster.clusterEndpoint.port |
| BACKUP_ROOT | Backup directory (default: /mnt/aurora-backups) | (internal default) |
| S3_BUCKET | S3 bucket for backup sync | backupBucketName |
| S3_PREFIX | S3 prefix (default: backups) | (internal default) |
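Taken together, these variables imply the S3 key layout `{S3_PREFIX}/{CLUSTER_IDENTIFIER}/{database}/{date}/`. A minimal sketch of how such a key could be composed (hypothetical helper, not the library's actual code):

```typescript
// Composes the S3 key prefix for one database's dated backup, following the
// layout documented above: backups/{CLUSTER_IDENTIFIER}/{db}/{YYYY-MM-DD}/
function backupKeyPrefix(
  clusterIdentifier: string,
  dbName: string,
  date: string,
  s3Prefix = 'backups', // matches the internal S3_PREFIX default
): string {
  return `${s3Prefix}/${clusterIdentifier}/${dbName}/${date}/`;
}

console.log(backupKeyPrefix('my-production-cluster', 'production', '2024-01-15'));
// backups/my-production-cluster/production/2024-01-15/
```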
Backup Process
- Validation: Checks AWS credentials and creates backup directories
- Database Backup: For each database in the DB_NAMES array:
  - Uses pg_dump --format=directory with gzip compression (level 9) for each data file
  - Creates a separate backup directory per database with a date stamp
  - If one database backup fails, continues with the remaining databases
- Verification: Validates that each backup contains a toc.dat file
- S3 Sync: Syncs each database backup to the S3 bucket under separate database folders
- Cleanup: Removes local backups after successful S3 sync
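The per-database dump step above could be approximated as follows (a hypothetical sketch of the pg_dump invocation; the container's actual script may differ):

```typescript
// Builds a pg_dump argument list matching the documented behavior:
// directory format, gzip level 9, one dated output directory per database.
function pgDumpArgs(dbName: string, backupRoot: string, date: string): string[] {
  return [
    '--format=directory', // directory format: toc.dat plus per-table data files
    '--compress=9',       // gzip level 9 for the .dat.gz data files
    `--file=${backupRoot}/${dbName}/${date}`, // e.g. /mnt/aurora-backups/production/2024-01-15
    '--dbname', dbName,
  ];
}

console.log(pgDumpArgs('production', '/mnt/aurora-backups', '2024-01-15').join(' '));
```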
Security Considerations
- ECR repositories created with image scanning enabled
- EFS encryption in transit supported
- IAM permissions follow principle of least privilege
- Use AWS Secrets Manager for database passwords in production
- Consider VPC endpoints for S3 to avoid internet traffic
Backup Storage Structure
Local EFS structure (per database):
/mnt/aurora-backups/
├── production/
│ └── YYYY-MM-DD/
│ ├── toc.dat # PostgreSQL table of contents
│ ├── ####.dat.gz # Compressed table data files
│ └── ####.dat.gz # Additional data files
├── analytics/
│ └── YYYY-MM-DD/
│ ├── toc.dat
│ └── ####.dat.gz
└── reporting/
└── YYYY-MM-DD/
├── toc.dat
        └── ####.dat.gz
S3 structure:
s3://my-backup-bucket/
└── backups/
└── {CLUSTER_IDENTIFIER}/
├── production/
│ └── YYYY-MM-DD/
│ ├── toc.dat
│ └── ####.dat.gz
├── analytics/
│ └── YYYY-MM-DD/
│ ├── toc.dat
│ └── ####.dat.gz
└── reporting/
└── YYYY-MM-DD/
├── toc.dat
            └── ####.dat.gz
Restoration
Interactive Restore CLI (Recommended)
This library includes an interactive TypeScript CLI that simplifies the restore process with auto-discovery and guided prompts:
npx ts-node restore_script/aurora-restore-cli.ts
Features:
- Auto-discovery: Automatically finds S3 backup buckets using the aurora_native_backup_bucket=true tag
- Interactive selection: Guided prompts for cluster, database, backup date, and tables
- Table-level restore: Select specific tables or restore the entire database
- Optimized downloads: Only downloads the required backup files
- Ready-to-run commands: Generates and optionally executes pg_restore commands
Prerequisites:
- Node.js and TypeScript installed
- AWS credentials configured (via AWS CLI, environment variables, or IAM role)
- pg_restore command available in your PATH
- Network access to the target PostgreSQL database
- Database user with restore permissions on target database:
  - CREATE privilege (for creating tables, indexes, constraints)
  - INSERT privilege (for loading data)
  - USAGE and CREATE on schemas
  - For a full database restore: CREATEDB privilege or a superuser role
Setup and Execution:
First, install dependencies:
cd restore_script
yarn install
Then run the interactive CLI:
npx ts-node aurora-restore-cli.ts
The CLI will guide you through selecting your backup source, target database, and specific tables to restore.
Workflow:
- S3 Configuration: Auto-discovers backup bucket or prompts for manual entry
- Source Selection: Choose cluster, database, and backup date
- Table Selection: Select specific tables or full database restore
- Target Configuration: Enter target database connection details
- Execution: Downloads backup files and generates restore command
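The final step's generated command could look like the following sketch (hypothetical helper; `buildRestoreCommand` only illustrates how table selections map to pg_restore's -t flag, it is not the CLI's actual code):

```typescript
// Assembles a pg_restore command from the user's selections; an empty table
// list produces a full-database restore, otherwise each table gets a -t flag.
function buildRestoreCommand(
  host: string,
  user: string,
  targetDb: string,
  backupDir: string,
  tables: string[] = [],
): string {
  const tableFlags = tables.map((t) => `-t ${t}`).join(' ');
  return ['pg_restore', `-h ${host}`, `-U ${user}`, `-d ${targetDb}`, '-v', tableFlags, backupDir]
    .filter(Boolean) // drop the empty table-flag segment when no tables are selected
    .join(' ');
}

console.log(buildRestoreCommand('target-host', 'backup_user', 'restored_db', '/tmp/backup', ['orders']));
```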
Manual Restoration
For advanced users or automation, backups are stored in S3 under organized paths:
s3://my-backup-bucket/backups/{CLUSTER_IDENTIFIER}/{DATABASE_NAME}/YYYY-MM-DD/
Download backup files:
aws s3 cp --recursive s3://my-backup-bucket/backups/{CLUSTER_IDENTIFIER}/production/YYYY-MM-DD/ /path/to/backup/directory/
Restore commands:
Full database restore:
pg_restore -h target-host -U username -d postgres -C -v /path/to/backup/directory/
(With -C, pg_restore connects to the database named by -d, typically postgres, then recreates the dumped database and restores into it.)
List backup contents:
pg_restore --list /path/to/backup/directory/
Selective table restore:
pg_restore -h target-host -U username -d target_db -v -t table_name /path/to/backup/directory/
Contributing
Contributions are welcome! Please follow these guidelines to help us maintain and improve the project:
Code Structure and Interfaces
- The main user-facing interfaces are:
  - AuroraBackupRepositoryProps in src/aurora-backup-repository.ts
  - AuroraNativeBackupServiceProps and AuroraBackupConnectionProps in src/aurora-native-backup-service.ts
- All constructs and their configuration interfaces are defined in the src/ directory.
Code Generation and Project Tasks
This project uses projen for project management and code generation.
If you make changes to the project configuration (.projenrc.ts), run:
npx projen
This will regenerate all managed files, including package.json and other configuration files.
Building and Testing
To build the project and run all tests, use:
yarn build
This will compile the code, run unit tests, and ensure everything is up to date.
License
This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.
