@aws-mdaa/datalake

v1.4.0

MDAA datalake module

Data Lake

This Data Lake CDK application is used to configure and deploy the resources required to define a secure S3-based Data Lake on AWS.


Deployed Resources and Compliance Details

Data Lake KMS Key - This key will be used to encrypt all Data Lake resources which support encryption at rest (including the Data Lake S3 Buckets).

  • Key usage access granted to all data lake roles (via key policy)
  • Encrypt access granted to S3 service to allow S3 Inventory data to be written (via key policy)
  • Additional permissions may be granted via IAM policy

Data Lake S3 Buckets - These buckets will be deployed to form the persistence basis of the Data Lake.

  • Bucket versioning will be automatically enabled
    • Only super user level access may delete object versions (permanent deletion)
    • Write access otherwise only allows creation of delete markers
  • Bucket will be encrypted by default using the Data Lake KMS Key
    • By default, exclusive use of Data Lake KMS key will be enforced via bucket policy
    • If exclusive use is not enforced, an alternative KMS key may be specified
    • The BucketKey feature is enabled to minimize impact on the KMS service during high-volume read/write operations
  • Bucket policy will enforce use of SSL
  • Access policy statements (in the bucket policy) are configured per prefix, and can grant read, write, or super user level access
  • By default, a defaultDeny bucket policy statement will be added to deny bucket read/write actions to any role not specified in the config
    • Configurable by bucket
  • Each bucket may have S3 Inventory enabled to automatically produce inventory data, written either to the bucket itself or to an external bucket
  • Each bucket may have S3 Lifecycle rules attached
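
As an illustration of the "enforce use of SSL" bullet above, the sketch below builds the kind of deny statement commonly used in S3 bucket policies for this purpose, keyed on the `aws:SecureTransport` condition. The function name, the `Sid`, and the bucket name are hypothetical, and the statement MDAA actually generates may differ.

```python
import json


def deny_insecure_transport(bucket_name: str, partition: str = "aws") -> dict:
    """Return a bucket policy statement that denies any non-TLS request.

    Hypothetical helper for illustration only; not part of MDAA.
    """
    bucket_arn = f"arn:{partition}:s3:::{bucket_name}"
    return {
        "Sid": "DenyInsecureTransport",  # assumed statement id
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        # Cover both the bucket itself and every object in it.
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
        # aws:SecureTransport is "false" when the request was not sent over TLS.
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }


statement = deny_insecure_transport("example-datalake-raw")
print(json.dumps(statement, indent=2))
```

Because the statement is an explicit deny, it takes precedence over any allow granted elsewhere in the bucket policy or through IAM.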

S3 Lifecycle Rules - A set of lifecycle rule configurations which can be applied across data lake buckets
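
Since a lifecycle rule moves objects through a sequence of storage classes, it can help to sanity-check a rule before deploying it. The sketch below is a hypothetical helper (not part of MDAA) that takes a rule shaped like the samples in the module config and checks that `Transitions` days strictly increase, that `ExpirationDays` falls after the last transition, and that `ExpiredObjectDeleteMarker` is not combined with `ExpirationDays`. The first two checks are assumptions about what a sensible rule looks like, not a description of MDAA's own validation.

```python
def check_lifecycle_rule(rule: dict) -> list[str]:
    """Return a list of problems found in one lifecycle rule (empty if none)."""
    problems = []
    days = [t["Days"] for t in rule.get("Transitions", [])]

    # Each transition should happen later than the one before it.
    for earlier, later in zip(days, days[1:]):
        if later <= earlier:
            problems.append(f"transition at day {later} does not follow day {earlier}")

    expiration = rule.get("ExpirationDays")
    if expiration is not None and days and expiration <= max(days):
        problems.append(
            f"ExpirationDays {expiration} is not after the last transition (day {max(days)})"
        )

    # The sample config notes these two settings cannot be combined.
    if expiration is not None and rule.get("ExpiredObjectDeleteMarker"):
        problems.append("ExpiredObjectDeleteMarker cannot be set together with ExpirationDays")
    return problems


sample_rule = {
    "Status": "Enabled",
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 60, "StorageClass": "GLACIER_IR"},
        {"Days": 150, "StorageClass": "GLACIER"},
        {"Days": 240, "StorageClass": "DEEP_ARCHIVE"},
    ],
    "ExpirationDays": 270,
}
print(check_lifecycle_rule(sample_rule))  # → []
```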

Configuration

MDAA Config

Add the following snippet to your mdaa.yaml under the modules: section of a domain/env in order to use this module:

          datalake: # Module Name can be customized
            module_path: "@aws-mdaa/datalake" # Must match module NPM package name
            module_configs:
              - ./datalake.yaml # Filename/path can be customized

Module Config (./datalake.yaml)

Config Schema Docs

roles:
  DataAdmin: # The Logical Config Role name
    # A list of role references (arn, name, id, or an SSM param resolving to one) which specify which physical roles will be bound to the Logical Config Role
    - arn: arn:{{partition}}:iam::{{account}}:role/Admin
    - name: Admin
    - id: AROA1234567890
  DataUser:
    - id: ssm:/sample-org/instance1/generated-role/test-role/id
    - arn: ssm:/sample-org/instance1/generated-role/data-scientist/arn
    - id: generated-role-id:test-role
    - id: generated-role-id:data-scientist
    # Resolves to the role generated by IAM Identity Center/SSO for the 'data_scientist' permission_set
    - name: data_scientist
      sso: true
      
# Definitions of access policies which grant access to S3 paths for specified Logical Config Roles.
# These Access Policies can then be applied to Data Lake buckets (they will be injected into the corresponding bucket policies.)
accessPolicies:
  Root: # A friendly name for the access policy
    rule:
      prefix: / # The S3 prefix path to which policy will be applied in the bucket policies.
      # A list of Logical Config Roles which will be provided ReadWriteSuper access.
      # ReadWriteSuper access allows reading, writing, and permanent data deletion.
      ReadWriteSuperRoles:
        - DataAdmin
  Data: # A friendly name for the access policy
    rule:
      prefix: /data
      ReadRoles:
        - DataUser

lifecycleConfigurations:
  SampleConfiguration1: # A friendly name for life cycle transition rules configuration.
    SampleRule1:
      Status: Enabled # Enabled or disabled
      Prefix: test_prefix # (Optional) Prefix within S3 bucket to which the rule applies.
      ObjectSizeGreaterThan: 500 # (Optional)
      ObjectSizeLessThan: 10000 # (Optional)
      AbortIncompleteMultipartUploadAfter: 2 # (Optional) Number of days after initiation of multipart upload.
      Transitions: # (Optional) Storage classes to move the current version of objects to after object upload.
        - Days: 30 # (Optional) Number of days after object creation.
          StorageClass: STANDARD_IA # (Optional) Storage class to move the object to
        - Days: 60
          StorageClass: GLACIER_IR
        - Days: 150
          StorageClass: GLACIER
        - Days: 240
          StorageClass: DEEP_ARCHIVE
      ExpirationDays: 270 # (Optional) Number of days. Current version of object will expire this many days after object creation.
      # ExpiredObjectDeleteMarker: True # Remove expired object delete markers. Cannot be set if ExpirationDays is set
      NoncurrentVersionTransitions: # (Optional) Storage classes to move the previous versions of objects to after object upload.
        - Days: 30 # (Optional) Number of days after object creation.
          StorageClass: STANDARD_IA # (Optional) Storage class to move the object to
          NewerNoncurrentVersions: 1 # (Optional) Number of latest non-current versions to retain.
        - Days: 60
          StorageClass: GLACIER_IR
          NewerNoncurrentVersions: 2
        - Days: 150
          StorageClass: GLACIER
          NewerNoncurrentVersions: 3
        - Days: 240
          StorageClass: DEEP_ARCHIVE
          NewerNoncurrentVersions: 4
      NoncurrentVersionExpirationDays: 270 # (Optional) Number of days. Non-current object will expire this many days after object creation.
      NoncurrentVersionsToRetain: 5 # (Optional) Number of latest non-current versions to retain.
    SampleRule2:
      Status: Enabled # Enabled or disabled
      Prefix: test_prefix # (Optional) Prefix within S3 bucket to which the rule applies.
      ObjectSizeGreaterThan: 500 # (Optional)
      ObjectSizeLessThan: 10000 # (Optional)
      AbortIncompleteMultipartUploadAfter: 2 # (Optional) Number of days after initiation of multi part upload
      Transitions:
        - Days: 30
          StorageClass: STANDARD_IA
        - Days: 60
          StorageClass: GLACIER_IR
        - Days: 150
          StorageClass: GLACIER
        - Days: 240
          StorageClass: DEEP_ARCHIVE
      ExpiredObjectDeleteMarker: True
      NoncurrentVersionTransitions:
        - Days: 30
          StorageClass: STANDARD_IA
          NewerNoncurrentVersions: 1
        - Days: 60
          StorageClass: GLACIER_IR
          NewerNoncurrentVersions: 2
        - Days: 150
          StorageClass: GLACIER
          NewerNoncurrentVersions: 3
        - Days: 240
          StorageClass: DEEP_ARCHIVE
          NewerNoncurrentVersions: 4
      NoncurrentVersionExpirationDays: 270 # Number of days. Non-current object will expire this many days after object creation.
      NoncurrentVersionsToRetain: 5
  SampleConfiguration2: # A friendly name for life cycle transition rules configuration.
    SampleRule1:
      Status: Enabled # Enabled or disabled
      Prefix: test_prefix # (Optional) Prefix within S3 bucket to which the rule applies.
      Transitions: # (Optional) Storage classes to move the current version of objects to after object upload.
        - Days: 30 # (Optional) Number of days after object creation.
          StorageClass: STANDARD_IA # (Optional) Storage class to move the object to
    SampleRule2:
      Status: Enabled # Enabled or disabled
      Prefix: test_prefix # (Optional) Prefix within S3 bucket to which the rule applies.
      NoncurrentVersionTransitions:
        - Days: 30
          StorageClass: STANDARD_IA
          NewerNoncurrentVersions: 1

# The set of S3 buckets which will be created, and the access policies which will be applied.
buckets:
  raw:
    defaultDeny: false
    #enableEventBridgeNotifications: true
    createFolderSkeleton: false
    #Inventory data will be written for each listed name/prefix under /inventory/<name>
    inventories:
      all-data:
        prefix: data
    accessPolicies:
      - Root
      - Data
    lifecycleConfiguration: SampleConfiguration1

  transformed:
    createFolderSkeleton: true
    enableEventBridgeNotifications: true
    lakeFormationLocations:
      all-data:
        prefix: data
      all-data2:
        prefix: data2
    accessPolicies:
      - Root
      - Data
    lifecycleConfiguration: SampleConfiguration2
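
The sample config ties several sections together by name: buckets refer to `accessPolicies` and `lifecycleConfigurations` entries, and access policies refer to logical roles. A minimal sketch of a pre-deployment check for dangling names follows. It assumes the config has already been loaded into a Python dict (for example with a YAML parser), and it treats any key under `rule` ending in `Roles` as a list of logical role names. `find_dangling_references` is a hypothetical helper, not part of MDAA.

```python
def find_dangling_references(config: dict) -> list[str]:
    """List names used in the config that are not defined in their section."""
    roles = set(config.get("roles", {}))
    policies = config.get("accessPolicies", {})
    lifecycles = set(config.get("lifecycleConfigurations", {}))
    problems = []

    # Access policies -> logical roles (any key under `rule` ending in "Roles").
    for policy_name, policy in policies.items():
        for key, value in policy.get("rule", {}).items():
            if key.endswith("Roles"):
                for role in value:
                    if role not in roles:
                        problems.append(f"accessPolicies.{policy_name}: unknown role '{role}'")

    # Buckets -> access policies and lifecycle configurations.
    for bucket_name, bucket in config.get("buckets", {}).items():
        for policy_name in bucket.get("accessPolicies", []):
            if policy_name not in policies:
                problems.append(f"buckets.{bucket_name}: unknown access policy '{policy_name}'")
        lifecycle = bucket.get("lifecycleConfiguration")
        if lifecycle is not None and lifecycle not in lifecycles:
            problems.append(
                f"buckets.{bucket_name}: unknown lifecycle configuration '{lifecycle}'"
            )
    return problems


# A trimmed-down config with one deliberate mistake: SampleConfiguration2 is not defined.
config = {
    "roles": {"DataAdmin": [{"name": "Admin"}], "DataUser": [{"name": "data_scientist", "sso": True}]},
    "accessPolicies": {
        "Root": {"rule": {"prefix": "/", "ReadWriteSuperRoles": ["DataAdmin"]}},
        "Data": {"rule": {"prefix": "/data", "ReadRoles": ["DataUser"]}},
    },
    "lifecycleConfigurations": {"SampleConfiguration1": {}},
    "buckets": {
        "raw": {"accessPolicies": ["Root", "Data"], "lifecycleConfiguration": "SampleConfiguration2"},
    },
}
for problem in find_dangling_references(config):
    print(problem)
```

Run against the trimmed-down dict above, the check reports a single problem: the `raw` bucket names a lifecycle configuration that is not defined.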