# @holistics/aml-regression-tests

v2.0.0

AML regression tests repository for comparing behavior between stable and latest versions of the AML compiler.
## High-level Architecture Design

### Overview

This test suite compares the compilation results of AML files between two versions:

- **Stable version:** `@holistics/[email protected]` with `@holistics/[email protected]`
- **Latest version:** `@holistics/[email protected]` with workspace `@holistics/aml-std`
The tests ensure that changes to the AML compiler don't break existing customer code by:

- Fetching AML files from customer repositories (via git service or local filesystem)
- Compiling the same files with both stable and latest versions
- Comparing the compilation results (excluding known differences such as `__type__` fields)
- Testing serialized cache backward compatibility, ensuring the latest compiler works with the stable cache
- Generating detailed diff reports for any discrepancies
## Test Types

The regression test suite includes two distinct test types per tenant:

### 1. Regression Tests (compare results)

- **Purpose:** Ensure the latest compiler produces the same results as the stable compiler
- **Comparison:** stable results vs. latest results
- **Detects:** Breaking changes in compilation logic

### 2. Cache Compatibility Tests (check serialized cache compatibility)

- **Purpose:** Ensure the latest compiler works consistently with stable's serialized cache
- **Comparison:** latest results (fresh) vs. latest results (using stable cache)
- **Detects:** Serialized cache format changes or backward-compatibility issues

**Key insight:** A tenant can pass regression tests but fail cache compatibility (or vice versa), providing targeted debugging information.

Detailed implementation: `docs/serialized-cache-compatibility-plan.md`
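Conceptually, the two checks per tenant reduce to three compilations and two comparisons. Here is a minimal sketch; the helper names follow the flow described in this document, but their exact signatures are assumptions:

```typescript
// Sketch of the per-tenant checks. Helper signatures are assumptions
// based on this document's flow (compileStable returns results + cache).
type CompileOutput = Record<string, unknown>;

interface TenantReport {
  regressionPassed: boolean;
  cacheCompatible: boolean;
}

function runTenantChecks(
  compileStable: () => { results: CompileOutput; cache: unknown },
  compileLatest: () => CompileOutput,
  compileLatestWithStableCache: (cache: unknown) => CompileOutput,
  isEqualExclude: (a: CompileOutput, b: CompileOutput) => boolean,
): TenantReport {
  const stable = compileStable();
  const latest = compileLatest();
  const latestWithCache = compileLatestWithStableCache(stable.cache);

  return {
    // Regression test: stable results vs. latest results
    regressionPassed: isEqualExclude(stable.results, latest),
    // Cache compatibility: latest (fresh) vs. latest (using stable cache)
    cacheCompatible: isEqualExclude(latest, latestWithCache),
  };
}
```

Because the two booleans are independent, a tenant can fail one check while passing the other, which is exactly the targeted signal described above.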
## How It Works

### Test Flow Architecture

```mermaid
sequenceDiagram
    participant User
    participant TestController
    participant Vitest
    participant Database
    participant GitService
    participant Artifacts
    participant Slack

    alt Controller Mode
        User->>TestController: node testController.ts
        TestController->>Slack: Send start notification
        TestController->>TestController: getRepoTenantPaths()
        TestController->>Database: Query tenant repositories
        Database-->>TestController: Tenant data
        loop For each tenant (sequential)
            TestController->>Vitest: Spawn vitest process for tenant
            alt Git Mode
                Vitest->>GitService: Fetch AML files
                GitService-->>Vitest: AML file contents
            else Local Mode
                Vitest->>Vitest: Read local files
            end
            Vitest->>Vitest: compileLatest(files)
            Vitest->>Vitest: compileStable(files) → {results, cache}
            Vitest->>Vitest: compileLatestWithStableCache(files, cache)
            Vitest->>Vitest: isEqualExclude(stable, latest)
            Vitest->>Vitest: isEqualExclude(latest, latest-with-cache)
            Vitest->>Artifacts: Write diff files (on mismatch)
            Vitest->>Artifacts: Write tenant results JSON
        end
        TestController->>Slack: Aggregate results & notify end
        TestController->>Slack: Upload artifact files
    else Direct Mode
        User->>Vitest: pnpm vitest run
        Note over Vitest: Same tenant processing loop
        Vitest->>Artifacts: Write single output file
    end
```

### Core Components
- **Connector** (`tests/connector.ts`):
  - Database queries to find tenant repositories
  - Git service integration for fetching AML files
  - Local filesystem reading for development
- **Helpers** (`tests/helpers.ts`):
  - AML compilation using stable and latest versions
  - Serialized cache compatibility testing with `compileLatestWithStableCache()`
  - Deep object comparison with exclusion rules
  - Diff collection integration with the artifact system
- **Artifact Management** (`tests/artifact.ts`):
  - Consolidated diff collection and per-tenant flushing
  - Directory structure management (`diffs/` subfolder)
  - Memory-efficient diff aggregation across test runs
- **Test Controller** (`tests/testController.ts`):
  - Parallel test execution management (bounded via `p-limit`)
  - Process spawning and coordination
  - Result aggregation with failure reporting
- **Slack Integration** (`tests/slack.ts`):
  - Test start/end notifications
  - Granular reporting with separate regression and cache compatibility results
  - Result reporting with pass/fail counts per test type
  - Test report uploads (individual JSON files)
  - Compressed diff archive uploads (`diffs.zip`)
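The "deep object comparison with exclusion rules" could look like the following sketch. The real `isEqualExclude` lives in `tests/helpers.ts`; the excluded `__type__` key comes from the Overview, and everything else here is illustrative:

```typescript
// Hypothetical sketch of a deep comparison that ignores certain keys,
// in the spirit of isEqualExclude from tests/helpers.ts.
function isEqualExclude(
  a: unknown,
  b: unknown,
  excludedKeys: string[] = ["__type__"],
): boolean {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  // Compare only the keys that are not excluded, recursively.
  const keysA = Object.keys(a as object).filter((k) => !excludedKeys.includes(k));
  const keysB = Object.keys(b as object).filter((k) => !excludedKeys.includes(k));
  if (keysA.length !== keysB.length) return false;
  return keysA.every((k) =>
    isEqualExclude((a as any)[k], (b as any)[k], excludedKeys),
  );
}
```

This lets two compilation results count as equal even when they disagree only on known-noisy fields such as `__type__`.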
## Installation

```bash
pnpm install
```

### Dependencies Added for Artifact Management

- `adm-zip`: Compression library for creating `diffs.zip` archives; enables clean Slack uploads with consolidated diff files
## Configuration

Create environment config file:

```bash
cp .env.sample .env
```

## Operating Modes
The system operates in two distinct modes with different dependencies and use cases:
### 1. Local Development Mode

**Use case:** Testing changes during development with local AML files.

```bash
# Enable local mode
READ_LOCAL=true

# Local directory paths containing AML files
LOCAL_PATHS=["./local/tenant1", "./local/tenant2"]
```

**Features:**

- Reads AML files directly from the local filesystem
- No database or git service dependencies
- Tenant IDs derived from directory names
- Fast iteration for development
- Ideal for testing compiler changes
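In local mode, tenant discovery can be as simple as listing the configured directories. A sketch, assuming the tenant ID is the directory's base name as described above (the helper name and `.aml` filtering are illustrative):

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Sketch of local-mode discovery: each entry of LOCAL_PATHS is one tenant,
// whose ID is derived from the directory name. Helper name is hypothetical.
function readLocalTenants(localPaths: string[]): Map<string, string[]> {
  const tenants = new Map<string, string[]>();
  for (const dir of localPaths) {
    const tenantId = path.basename(dir); // "./local/tenant1" -> "tenant1"
    // Recursively list files and keep only AML sources.
    const files = fs
      .readdirSync(dir, { recursive: true })
      .map(String)
      .filter((f) => f.endsWith(".aml"));
    tenants.set(tenantId, files);
  }
  return tenants;
}
```

Note that `fs.readdirSync(dir, { recursive: true })` requires Node 18.17 or later.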
### 2. Production/Git Mode (Default)

**Use case:** Testing against customer repositories via database and git service.

```bash
# Database connection (required for this mode)
DB_USERNAME=your_username
DB_PASSWORD=your_password
DB_HOST=localhost
DB_PORT=5432
DB_DATABASE=your_database

# Git service configuration
GIT_SERVICE_URL=http://0.0.0.0:8080
REPO_PATH_PREFIX=/opt/holistics/git_data/repositories/

# Test targeting
TENANT_IDS=["123", "456", "789"]  # Specific tenant IDs
# OR
RUN_ALL=true                      # All active tenants

# Concurrency (optional)
NUMBER_OF_TEST_PROCESSES=4        # Spawn up to 4 tenants in parallel
```

**Features:**

- Queries customer repositories from the database
- Fetches AML files via the git service using commit IDs
- Supports both specific tenant testing and bulk processing
- Production-like testing environment
## Scaling & Distribution

Multiple instances can run simultaneously with different tenant ranges:

```bash
# Instance 1: Handles tenants 1-100
OFFSET=0
LIMIT=100

# Instance 2: Handles tenants 101-200
OFFSET=100
LIMIT=100

# Instance 3: Handles tenants 201-300
OFFSET=200
LIMIT=100
```

Within a single controller process, the pool size is controlled by `NUMBER_OF_TEST_PROCESSES`. Increase it to drive more concurrent tenants per node, or drop it back to 1 when memory-constrained.

**Key features:**

- Only applies when `RUN_ALL=true` (ignored with `TENANT_IDS`)
- Database query uses `ORDER BY tenant_id` for consistent tenant allocation
- Enables distributed processing across multiple containers/machines
- Each instance processes tenants sequentially for memory efficiency
- Artifact files labeled: `container-${OFFSET}-${LIMIT}_tenant_${tenantId}.json`
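The way `OFFSET`/`LIMIT` partition the tenant set can be sketched as a paginated query. The table and column names below are assumptions for illustration; the actual query lives in `tests/connector.ts`:

```typescript
// Illustrative sketch of the tenant pagination query. Table/column names
// are assumptions; only ORDER BY tenant_id + OFFSET/LIMIT are from the doc.
function buildTenantQuery(offset: number, limit: number): string {
  // ORDER BY tenant_id gives every instance the same stable ordering, so
  // non-overlapping OFFSET/LIMIT windows partition tenants without overlap.
  return (
    `SELECT tenant_id, repo_path, commit_id FROM tenant_repositories ` +
    `ORDER BY tenant_id OFFSET ${offset} LIMIT ${limit}`
  );
}
```

The stable ordering is the key design point: without it, two instances could pick overlapping (or missing) tenants between runs.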
### Scaling Example

```bash
# Deploy 3 instances for distributed processing
# Instance 1: Processes 100 tenants sequentially
OFFSET=0 LIMIT=100

# Instance 2: Processes 100 tenants sequentially
OFFSET=100 LIMIT=100

# Instance 3: Processes 100 tenants sequentially
OFFSET=200 LIMIT=100
```

## Output & Notifications
### Artifacts Configuration

```bash
# Directory for test results and diff files
ARTIFACT_DIR=./test-results

# Output file name (optional for single process mode)
ARTIFACT_FILE=test-output.json
```

### Slack Integration (Optional)

```bash
SLACK_CHANNEL_ID=C1234567890
SLACK_TOKEN=xoxb-your-slack-token
```

**Notification features:**

- Test start/end notifications with container labels
- Pass/fail summary with tenant ID lists
- Automatic artifact file uploads for detailed analysis

### Diff Generation Options

```bash
# Use system 'diff' command for better diff quality (default: true)
OFFLINE_DIFF=true
```

**Offline diff features:**

- **Enhanced diff quality:** Uses the system `diff` command instead of a JavaScript diff library
- **Better performance:** More efficient for large JSON files
- **Unified format:** Generates standard unified diff format
- **Automatic fallback:** Falls back to the JavaScript diff if the system `diff` is unavailable
- **Local temp files:** Creates temporary files in a `.tmp/` directory within the repo (auto-cleaned)
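The system-diff path with a JavaScript fallback could be sketched like this. File layout, cleanup, and the fallback text are illustrative; the real implementation in this repo may differ:

```typescript
import { spawnSync } from "node:child_process";
import * as fs from "node:fs";
import * as path from "node:path";

// Sketch: serialize both results to temp files, shell out to `diff -u`,
// fall back to a placeholder JS-side diff when `diff` is unavailable.
function unifiedDiff(expected: unknown, actual: unknown, tmpDir = ".tmp"): string {
  fs.mkdirSync(tmpDir, { recursive: true });
  const expectedPath = path.join(tmpDir, "expected.json");
  const actualPath = path.join(tmpDir, "actual.json");
  fs.writeFileSync(expectedPath, JSON.stringify(expected, null, 2) + "\n");
  fs.writeFileSync(actualPath, JSON.stringify(actual, null, 2) + "\n");
  try {
    const res = spawnSync("diff", ["-u", expectedPath, actualPath], {
      encoding: "utf8",
    });
    if (res.error) throw res.error; // `diff` binary not found -> fallback
    return res.stdout; // exit code 1 only means the files differ
  } catch {
    // Stand-in for a JavaScript diff library fallback (not a real diff).
    return "--- expected\n+++ actual\n(system diff unavailable)";
  } finally {
    // Auto-clean the temp directory, mirroring the behavior described above.
    fs.rmSync(tmpDir, { recursive: true, force: true });
  }
}
```

The key detail is treating `diff`'s exit code 1 as "files differ" rather than as an error, and only falling back when the binary itself is missing.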
## How to Run Tests

### 1. Direct Vitest Execution

**Best for:** Development, debugging, small tenant sets

```bash
# Configure your mode (see Operating Modes section above)
export READ_LOCAL=true  # or configure production mode

# Run single process
pnpm vitest run
```

**Characteristics:**

- Direct vitest execution with detailed console output
- Single output file: `${ARTIFACT_FILE}` (default: `test-output.json`)
- Easier debugging and development iteration
- Suitable for small tenant sets or development

### 2. Test Controller Execution

**Best for:** Large tenant sets, production testing

```bash
# Configure your mode
export RUN_ALL=true

# Run via test controller (processes tenants sequentially)
node tests/testController.ts
```

**Characteristics:**

- Processes tenants sequentially for memory efficiency
- Multiple output files: `container-${OFFSET}-${LIMIT}_tenant_${tenantId}.json`
- Automatic Slack notifications (if configured)
- Avoids memory accumulation issues
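The controller's bounded parallelism (configured via `NUMBER_OF_TEST_PROCESSES`) can be sketched without the `p-limit` dependency; here `runTenant` is a stand-in for spawning a vitest process per tenant:

```typescript
// Minimal bounded-concurrency sketch in the spirit of p-limit: at most
// `concurrency` tenants run at once, each in its own runTenant call.
async function runAllTenants<T>(
  tenantIds: string[],
  runTenant: (id: string) => Promise<T>, // stand-in for spawning vitest
  concurrency: number, // NUMBER_OF_TEST_PROCESSES
): Promise<T[]> {
  const results: T[] = new Array(tenantIds.length);
  let next = 0;
  // Each worker pulls the next tenant index until none remain.
  async function worker(): Promise<void> {
    while (next < tenantIds.length) {
      const i = next++;
      results[i] = await runTenant(tenantIds[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(concurrency, tenantIds.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results;
}
```

With `concurrency = 1` this degenerates to the strictly sequential processing described above, which is the memory-efficient default.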
### 3. Docker Execution

**Best for:** Production-like environment, CI/CD

```bash
# Setup credentials
echo '@holistics:registry=https://npm.pkg.github.com/
//npm.pkg.github.com/:_authToken=<token>' > .npmrc

# Build and run
docker compose build
docker compose up --abort-on-container-exit
```

**Characteristics:**

- Includes git service container automatically
- Isolated environment with consistent dependencies
- Volume mounting for accessing results
- Production-like configuration
### Execution Method Comparison

| Aspect | Direct Vitest | Test Controller | Docker |
|--------|---------------|-----------------|--------|
| Command | `pnpm vitest run` | `node tests/testController.ts` | `docker compose up` |
| Processes | 1 vitest process | 1 vitest process per tenant | Sequential processing |
| Output Files | Single file | Multiple files | Multiple files |
| Git Service | External dependency | External dependency | Included |
| Best For | Development/Debug | Large-scale testing | Production/CI |
| Slack Notifications | ❌ | ✅ | ✅ |
## Test Results & Artifacts

### Output Structure

Each test run generates:

1. **Test Output JSON:** Vitest results with tenant metadata
   - Location: `${ARTIFACT_DIR}/${ARTIFACT_FILE}` or `${ARTIFACT_DIR}/container-${OFFSET}-${LIMIT}_process_${N}.json`
   - Contains: Test results, pass/fail status, tenant IDs
2. **Consolidated Diff Files:** Aggregated comparison reports per tenant
   - Location: `${ARTIFACT_DIR}/diffs/diff_${tenantId}_${repoName}.json`
   - Contains: Multiple diff entries per tenant, consolidated metadata
   - Structure: All failed comparisons for a tenant grouped in a single file
3. **Compressed Diff Archive:** Zip file for Slack upload
   - Location: `${ARTIFACT_DIR}/diffs.zip`
   - Contains: All files from the `diffs/` folder, compressed
   - Purpose: Single-file upload to Slack for easy download/sharing
### Example Consolidated Diff File Structure

```json
{
  "tenantId": "123",
  "repoName": "789",
  "repoPath": "tenant123/projects/456/789",
  "commitId": "abc123def456",
  "timestamp": "2025-01-23T10:30:00.000Z",
  "diffs": [
    {
      "label": "/models/users.aml",
      "expected": { /* stable compilation result */ },
      "actual": { /* latest compilation result */ },
      "unifiedDiff": "--- expected\n+++ actual\n@@ -1,4 +1,4 @@\n..."
    },
    {
      "label": "/datasets/orders.aml",
      "expected": { /* stable compilation result */ },
      "actual": { /* latest compilation result */ },
      "unifiedDiff": "--- expected\n+++ actual\n@@ -10,2 +10,3 @@\n..."
    }
  ]
}
```

**Key features:**

- **Consolidated:** Multiple failed comparisons per tenant in a single file
- **Organized:** All diffs for a tenant/repository grouped together
- **Compressed:** `diffs.zip` contains all tenant diff files for Slack upload
- **Memory-efficient:** Diffs collected in memory and flushed per tenant
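The per-tenant aggregation described above can be sketched as a small in-memory collector. Class and method names are illustrative; the real logic lives in `tests/artifact.ts`:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Sketch of per-tenant diff aggregation: entries accumulate in memory and
// are flushed once per tenant to a single consolidated JSON file. Field
// names follow the example structure in this document.
interface DiffEntry {
  label: string;
  expected: unknown;
  actual: unknown;
  unifiedDiff: string;
}

class DiffCollector {
  private diffs: DiffEntry[] = [];

  add(entry: DiffEntry): void {
    this.diffs.push(entry);
  }

  // Writes ${artifactDir}/diffs/diff_${tenantId}_${repoName}.json and
  // clears the buffer so the next tenant starts fresh.
  flush(artifactDir: string, tenantId: string, repoName: string): string {
    const dir = path.join(artifactDir, "diffs");
    fs.mkdirSync(dir, { recursive: true });
    const file = path.join(dir, `diff_${tenantId}_${repoName}.json`);
    const payload = {
      tenantId,
      repoName,
      timestamp: new Date().toISOString(),
      diffs: this.diffs,
    };
    fs.writeFileSync(file, JSON.stringify(payload, null, 2));
    this.diffs = [];
    return file;
  }
}
```

Flushing per tenant is what keeps memory bounded: only one tenant's diffs are ever held in memory at a time.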
### Slack Notifications

When configured, the system sends:

- **Start notification:** Container label and test initiation
- **End notification:**
  - Granular pass/fail summary with separate regression and cache compatibility results
  - Format: `✅ Regression Passed: [tenant1, tenant2]` and `✅ Cache Compatible: [tenant1, tenant3]`
- **Test reports:** Individual JSON files uploaded for quick inspection
- **Diff archive:** A single `diffs.zip` upload containing all diff files
- Clean thread organization (test reports + compressed diffs)
## Troubleshooting

### Common Issues

**Git Service Connection Failed**

```
Error: connect ECONNREFUSED 127.0.0.1:8080
```

- Ensure the git service is running
- Check the `GIT_SERVICE_URL` configuration
- For Docker: verify the service dependency

**Database Connection Failed**

```
Error: password authentication failed
```

- Verify database credentials in `.env`
- Ensure the database is accessible
- For Docker: check `host.docker.internal` connectivity

**Out of Memory Errors**

```
JavaScript heap out of memory
```

- The testController uses `--max-old-space-size=8192` and processes tenants sequentially
- Consider testing fewer tenants per batch using `OFFSET` and `LIMIT`
- Each tenant gets a fresh process to avoid memory accumulation

**Permission Errors (Docker)**

```
Error: EACCES: permission denied
```

- Ensure proper file permissions on mounted volumes
- Consider using `user: "1000:1000"` in docker-compose.yml
### Debug Mode

For detailed debugging:

```bash
# Enable verbose output
export DEBUG=1

# Run single tenant with local files
export READ_LOCAL=true
export LOCAL_PATHS='["./debug-tenant"]'
pnpm vitest run --reporter=verbose
```

### Performance Tuning

- **For large tenant sets:** Use `OFFSET` and `LIMIT` to process in batches across multiple instances
- **For distributed processing:** Deploy multiple instances with different tenant ranges
- **For memory efficiency:** Sequential processing eliminates memory accumulation issues
