@rghm/s3-replica
v0.1.3
S3 Replica Module
A utility to replicate objects from a source S3 bucket to a destination bucket using AWS SDK v3. It supports:
- Checkpointing
- Starting replication from a specific key
- Stopping after a certain key
- Optional logging
Note: This package is intended for small-scale replication between S3 buckets. It is not optimized for large-scale transfers. For replicating a large number of objects efficiently, consider using tools like s5cmd.
Logic Overview
Initialization
- Creates S3 clients for both source and destination using the provided configurations.
- Tracks the last synced item (`lastSyncedItem`) and the checkpoint counter.
Checkpoint
- Maintains an anchor file (`__REPLICA_ANCHOR__`) in the destination bucket to store the last successfully replicated key.
- Updates the checkpoint after processing a configurable number of objects (`checkpointInterval`).
- If `ensureCheckpointExists` is `true` (default), the module verifies that the checkpoint file exists in the source bucket.
  - Fails fast if the checkpoint is missing, to prevent accidentally starting replication from the wrong point, especially when dealing with large files.
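The interval-based checkpoint update described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: `CheckpointTracker` and its `record` method are hypothetical names, and the write to the `__REPLICA_ANCHOR__` object is left to the caller.

```typescript
// Hypothetical helper, not part of the package's documented API.
class CheckpointTracker {
  // Last key that was replicated successfully.
  lastSyncedItem: string;
  // Number of objects processed since the last checkpoint write.
  private sinceLastSave = 0;
  private readonly interval: number;

  constructor(interval: number = 50, startAfter: string = "") {
    this.interval = interval;
    this.lastSyncedItem = startAfter;
  }

  // Records one replicated key and returns true when a checkpoint write is due.
  record(key: string): boolean {
    this.lastSyncedItem = key;
    this.sinceLastSave += 1;
    if (this.sinceLastSave >= this.interval) {
      this.sinceLastSave = 0;
      return true;
    }
    return false;
  }
}
```

In this sketch, the replication loop would call `record` after each successful upload and write `lastSyncedItem` to the anchor file whenever it returns `true`.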
Replication Process
- Fetches objects from the source bucket page by page using `paginateListObjectsV2`.
- Processes each page sequentially:
  - Downloads each object from the source bucket.
  - Uploads it to the destination bucket.
  - Updates the checkpoint.
- Starts copying after the key specified in `startAfter` (the key itself is excluded).
  - If the starting file does not exist, the behavior depends on `ensureCheckpointExists`: it either starts from the first available object or fails immediately.
- Stops replication if a `stopAfter` key is reached.
- Logs progress using the provided logger or `console.log`.
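To make the `startAfter` (exclusive) and `stopAfter` (inclusive) boundaries concrete, here is a small sketch that works on pages of keys already held in memory instead of calling S3. The `selectKeys` function is hypothetical and only illustrates the boundary handling; it assumes keys arrive in ascending order, as S3 listings do, and that a `stopAfter` key missing from the listing ends the run at the first key that sorts after it. The package itself may handle that last case differently.

```typescript
// Hypothetical helper: picks the keys to copy from already-listed pages.
// Assumes keys are sorted ascending within and across pages.
function selectKeys(
  pages: string[][],
  startAfter: string = "",
  stopAfter?: string
): string[] {
  const selected: string[] = [];
  for (const page of pages) {
    for (const key of page) {
      // startAfter is exclusive: skip it and everything before it.
      if (key <= startAfter) continue;
      // Assumption: if stopAfter is absent from the listing, stop at the first later key.
      if (stopAfter !== undefined && key > stopAfter) return selected;
      selected.push(key);
      // stopAfter is inclusive: copy it, then stop.
      if (key === stopAfter) return selected;
    }
  }
  return selected;
}
```

Note that JavaScript compares strings by UTF-16 code units, which matches S3's ordering for plain ASCII keys but can differ for some non-ASCII keys.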
Fault Handling
- Stops immediately on errors (no retries).
- Ensures checkpoint consistency across restarts by comparing with the anchor key in the destination bucket.
Configuration Options (ReplicaOptions)
| Option | Type | Default | Description |
| ------------------------ | --------------------------- | ------------- | --------------------------------------------------------------------------- |
| sourceConfig | S3ClientConfig | — | AWS SDK configuration for the source bucket. |
| destConfig | S3ClientConfig | — | AWS SDK configuration for the destination bucket. |
| sourceBucket | string | — | Name of the source bucket to replicate from. |
| destBucket | string | — | Name of the destination bucket to replicate to. |
| startAfter | string | "" | Key to start replication after (exclusive). |
| stopAfter | string? | — | Optional key at which replication should stop (inclusive). |
| logger | (...args: any[]) => void? | console.log | Optional custom logger function. |
| ensureCheckpointExists | boolean? | true | Fail immediately if the checkpoint key does not exist in the source bucket. |
| checkpointInterval | number? | 50 | Number of objects processed before saving the checkpoint. Max: 1000. |
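
As a rough illustration of how the defaults in the table fit together, the sketch below models the options as a TypeScript interface and fills in missing values. Several things here are assumptions: `ClientConfigLike` is a stand-in for the SDK's `S3ClientConfig` so that the snippet has no dependencies, `ReplicaOptionsSketch` and `resolveOptions` are hypothetical names, `startAfter` is treated as optional because it has a default, and rejecting a `checkpointInterval` outside 1 to 1000 with an error is a guess (the package may handle out-of-range values differently).

```typescript
// Stand-in for S3ClientConfig from @aws-sdk/client-s3 (assumption, keeps the sketch dependency-free).
type ClientConfigLike = Record<string, unknown>;

// Hypothetical mirror of the ReplicaOptions table above.
interface ReplicaOptionsSketch {
  sourceConfig: ClientConfigLike;
  destConfig: ClientConfigLike;
  sourceBucket: string;
  destBucket: string;
  startAfter?: string;
  stopAfter?: string;
  logger?: (...args: any[]) => void;
  ensureCheckpointExists?: boolean;
  checkpointInterval?: number;
}

// Applies the documented defaults; the range check is an assumed behavior.
function resolveOptions(options: ReplicaOptionsSketch) {
  const checkpointInterval = options.checkpointInterval ?? 50;
  if (
    !Number.isInteger(checkpointInterval) ||
    checkpointInterval < 1 ||
    checkpointInterval > 1000
  ) {
    throw new RangeError("checkpointInterval must be an integer between 1 and 1000");
  }
  return {
    ...options,
    startAfter: options.startAfter ?? "",
    logger: options.logger ?? console.log,
    ensureCheckpointExists: options.ensureCheckpointExists ?? true,
    checkpointInterval,
  };
}
```

Called with only the four required fields, `resolveOptions` returns an object with `startAfter` set to `""`, `ensureCheckpointExists` set to `true`, and `checkpointInterval` set to `50`.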
