@wundr.io/analysis-engine
Enterprise-grade code analysis engine with advanced AST parsing, intelligent duplicate detection, comprehensive complexity metrics, and high-performance optimizations. Built for analyzing large-scale codebases with memory efficiency and blazing-fast execution.
Overview
The Analysis Engine is a sophisticated TypeScript/JavaScript code analysis toolkit that combines six powerful analysis engines with cutting-edge performance optimizations. Designed to handle massive codebases with ease, it delivers actionable insights while maintaining a minimal memory footprint and maximum throughput.
Key Features
- Six Advanced Analysis Engines: Comprehensive code quality assessment
- High-Performance Architecture: 15,000+ files/second processing speed
- Memory Efficient: <250MB memory usage for large codebases
- Concurrent Processing: 30+ concurrent workers with intelligent load balancing
- Streaming Analysis: 60-80% memory reduction with streaming processors
- Real-time Monitoring: Built-in memory and performance monitoring
- Enterprise Ready: Production-grade error handling and resilience
- Rich Reporting: JSON, HTML, Markdown, and CSV output formats
Performance Highlights
| Metric             | Performance                |
| ------------------ | -------------------------- |
| Processing Speed   | 15,000+ files/second       |
| Memory Usage       | <250MB for large codebases |
| Concurrent Workers | 30+ with auto-scaling      |
| Memory Reduction   | 60-80% with streaming      |
| Throughput         | 4.4x faster than baseline  |
| Cache Hit Rate     | 85%+ with object pooling   |
Installation
npm install @wundr.io/analysis-engine
Peer Dependencies
npm install typescript@^5.5.0
Quick Start
Basic Analysis
import { AnalysisEngine, analyzeProject } from '@wundr.io/analysis-engine';
// Simple analysis
const report = await analyzeProject('/path/to/project', {
outputFormats: ['json', 'html'],
includeTests: false,
});
console.log(`Analyzed ${report.summary.totalFiles} files`);
console.log(`Found ${report.duplicates.clusters.length} duplicate clusters`);
console.log(`Average complexity: ${report.complexity.averageCyclomaticComplexity}`);
Advanced Usage with Progress Tracking
import { AnalysisEngine, analyzeProjectWithProgress } from '@wundr.io/analysis-engine';
const report = await analyzeProjectWithProgress(
'/path/to/project',
progress => {
console.log(`[${progress.type}] ${progress.message}`);
if (progress.percentage != null) {
console.log(`Progress: ${progress.percentage}%`);
}
},
{
performance: {
maxConcurrency: 30,
enableStreaming: true,
enableMemoryOptimization: true,
},
duplicateDetection: {
minSimilarity: 0.8,
enableSemanticAnalysis: true,
},
complexity: {
maxCyclomaticComplexity: 10,
maxCognitiveComplexity: 15,
},
}
);
Custom Engine Configuration
const engine = new AnalysisEngine({
targetDir: '/path/to/project',
// File filtering
exclude: ['**/*.spec.ts', '**/node_modules/**'],
includeTests: false,
// Performance tuning
performance: {
maxConcurrency: 30,
chunkSize: 100,
enableStreaming: true,
enableMemoryOptimization: true,
memoryLimit: 250 * 1024 * 1024, // 250MB
},
// Output configuration
outputFormats: ['json', 'html', 'markdown'],
outputDir: './analysis-reports',
// Enable optimizations
useOptimizations: true,
});
const report = await engine.analyze();
Analysis Engines
1. AST Parser Engine
Advanced TypeScript/JavaScript AST parsing with comprehensive entity extraction.
Capabilities:
- Classes, interfaces, types, and enums
- Functions, methods, and arrow functions
- Variables, constants, and exports
- JSDoc documentation extraction
- Dependency graph construction
- Signature and metadata analysis
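The example below iterates over extracted entities. A minimal shape consistent with the fields the examples in this README access (hypothetical; the package's exported type declarations are authoritative):

```typescript
// Hypothetical shape inferred from the examples in this README;
// the package's own type declarations are authoritative.
interface CodeEntity {
  type: string; // e.g. 'class' | 'interface' | 'function'
  name: string; // identifier of the entity
  file: string; // path to the source file
  line: number; // line number of the declaration
}
```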
Example:
import { ASTParserEngine } from '@wundr.io/analysis-engine';
const parser = new ASTParserEngine();
const entities = await parser.analyze(['src/**/*.ts'], config);
console.log(`Found ${entities.length} entities`);
entities.forEach(entity => {
console.log(`${entity.type}: ${entity.name} (${entity.file}:${entity.line})`);
});
2. Duplicate Detection Engine
Intelligent duplicate code detection with semantic and structural analysis.
Features:
- Hash-based clustering (see the sketch after this list)
- Semantic similarity analysis
- Structural pattern matching
- Fuzzy matching for near-duplicates
- Consolidation recommendations
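To illustrate the first item on the list, here is a minimal, self-contained sketch of hash-based clustering (not the engine's internal implementation): normalize each snippet so formatting-only differences disappear, hash the result, and group snippets that share a hash.

```typescript
import { createHash } from 'node:crypto';

// Collapse whitespace so formatting-only differences hash the same.
// A real detector also strips comments, renames identifiers, etc.
function normalizedHash(code: string): string {
  return createHash('sha256')
    .update(code.replace(/\s+/g, ' ').trim())
    .digest('hex');
}

// Bucket snippets by hash; buckets with 2+ members are clusters.
function clusterByHash(snippets: string[]): string[][] {
  const buckets = new Map<string, string[]>();
  for (const snippet of snippets) {
    const key = normalizedHash(snippet);
    buckets.set(key, [...(buckets.get(key) ?? []), snippet]);
  }
  return [...buckets.values()].filter(bucket => bucket.length > 1);
}
```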
Memory-Optimized Version:
import { OptimizedDuplicateDetectionEngine } from '@wundr.io/analysis-engine';
const duplicateEngine = new OptimizedDuplicateDetectionEngine({
minSimilarity: 0.8,
enableSemanticAnalysis: true,
enableStructuralAnalysis: true,
enableStreaming: true,
maxMemoryUsage: 200 * 1024 * 1024, // 200MB
clusteringAlgorithm: 'hash',
});
const clusters = await duplicateEngine.analyze(entities, config);
clusters.forEach(cluster => {
console.log(`\nDuplicate cluster (${(cluster.similarity * 100).toFixed(1)}% similar):`);
cluster.entities.forEach(entity => {
console.log(` - ${entity.file}:${entity.line} (${entity.name})`);
});
if (cluster.consolidationSuggestion) {
console.log(`Suggestion: ${cluster.consolidationSuggestion.strategy}`);
console.log(`Effort: ${cluster.consolidationSuggestion.estimatedEffort}`);
}
});
3. Complexity Metrics Engine
Comprehensive complexity analysis with multiple metrics and thresholds.
Metrics Calculated:
- Cyclomatic Complexity: Control flow complexity (see the worked example after this list)
- Cognitive Complexity: Mental effort required to understand code
- Maintainability Index: Overall maintainability score (0-100)
- Nesting Depth: Maximum nesting level
- Function Size: Lines of code and parameter count
- Technical Debt: Estimated hours to address complexity issues
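For a concrete sense of how the first two metrics are counted (conventions vary slightly between tools; this follows the common rules):

```typescript
// Hypothetical helper, declared only so the example type-checks.
declare function fetchSync(id: string): string;

// Cyclomatic complexity = 1 + number of decision points.
// Here: for (+1), if (+1), && (+1), catch (+1) => 1 + 4 = 5.
// Cognitive complexity additionally charges for nesting, so the
// `if` inside the `for` costs more than the same `if` at top level.
function loadAll(ids: string[], cache: Map<string, string>): string[] {
  const out: string[] = [];
  for (const id of ids) {           // +1
    if (id && !cache.has(id)) {     // if +1, && +1
      try {
        out.push(fetchSync(id));
      } catch {                     // +1
        cache.delete(id);
      }
    }
  }
  return out;
}
```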
Example:
import { ComplexityMetricsEngine } from '@wundr.io/analysis-engine';
const complexityEngine = new ComplexityMetricsEngine({
cyclomatic: { low: 5, medium: 10, high: 20, critical: 30 },
cognitive: { low: 7, medium: 15, high: 25, critical: 40 },
maintainability: { excellent: 85, good: 70, moderate: 50, poor: 25 },
nesting: { maxDepth: 4, warningDepth: 3 },
size: { maxLines: 100, maxParameters: 5 },
});
const report = await complexityEngine.analyze(entities, config);
console.log(`Average Cyclomatic: ${report.overallMetrics.averageCyclomaticComplexity}`);
console.log(`Average Cognitive: ${report.overallMetrics.averageCognitiveComplexity}`);
console.log(`Technical Debt: ${report.overallMetrics.totalTechnicalDebt} hours`);
// Complexity hotspots
report.complexityHotspots.forEach((hotspot, index) => {
console.log(`\n${index + 1}. ${hotspot.entity.name} (Score: ${hotspot.rank})`);
console.log(` Cyclomatic: ${hotspot.complexity.cyclomatic}`);
console.log(` Cognitive: ${hotspot.complexity.cognitive}`);
console.log(` Maintainability: ${hotspot.complexity.maintainability}`);
console.log(` Issues: ${hotspot.issues.join(', ')}`);
console.log(` Recommendations:`);
hotspot.recommendations.forEach(rec => console.log(` - ${rec}`));
});
4. Circular Dependency Engine
Detects and analyzes circular dependencies in your codebase.
Features:
- Dependency graph construction
- Cycle detection with depth analysis (sketched below)
- Impact assessment
- Break point suggestions
- Severity classification
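The core of cycle detection is a depth-first search that reports a cycle whenever it reaches a module already on the current DFS path. A minimal sketch over a file-to-imports map (a production tool would typically use a strongly-connected-components algorithm instead):

```typescript
type DependencyGraph = Map<string, string[]>; // file -> files it imports

function findCycles(graph: DependencyGraph): string[][] {
  const cycles: string[][] = [];
  const visited = new Set<string>();
  const onPath = new Set<string>();
  const path: string[] = [];

  function dfs(node: string): void {
    visited.add(node);
    onPath.add(node);
    path.push(node);
    for (const dep of graph.get(node) ?? []) {
      if (onPath.has(dep)) {
        // Back edge: everything from dep to here forms a cycle.
        cycles.push([...path.slice(path.indexOf(dep)), dep]);
      } else if (!visited.has(dep)) {
        dfs(dep);
      }
    }
    path.pop();
    onPath.delete(node);
  }

  for (const node of graph.keys()) {
    if (!visited.has(node)) dfs(node);
  }
  return cycles;
}

// findCycles(new Map([['a.ts', ['b.ts']], ['b.ts', ['a.ts']]]))
// => [['a.ts', 'b.ts', 'a.ts']]
```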
Example:
import { CircularDependencyEngine } from '@wundr.io/analysis-engine';
const circularEngine = new CircularDependencyEngine();
const cycles = await circularEngine.analyze(entities, config);
cycles.forEach(cycle => {
console.log(`\nCircular dependency (depth: ${cycle.depth}):`);
console.log(`Path: ${cycle.cycle.join(' -> ')}`);
console.log(`Severity: ${cycle.severity}`);
console.log(`Files involved: ${cycle.files.join(', ')}`);
console.log(`Suggestions:`);
cycle.suggestions.forEach(s => console.log(` - ${s}`));
});
5. Code Smell Engine
Identifies common code smells and anti-patterns.
Detected Smells:
- Long methods (>100 lines; see the sketch after this list)
- Large classes (>15 methods)
- Duplicate code blocks
- Dead/unreachable code
- Complex conditionals
- Feature envy
- Inappropriate intimacy
- God objects
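The first two smells on the list are simple threshold rules; a sketch of that style of check, using the thresholds above (the metrics shape is hypothetical):

```typescript
// Hypothetical per-entity size metrics; the real engine derives
// these from the parsed AST.
interface EntitySize {
  name: string;
  kind: 'function' | 'class';
  lines: number;
  methodCount?: number;
}

function sizeSmells(entity: EntitySize): string[] {
  const findings: string[] = [];
  if (entity.kind === 'function' && entity.lines > 100) {
    findings.push(`long-method: ${entity.name} (${entity.lines} lines > 100)`);
  }
  if (entity.kind === 'class' && (entity.methodCount ?? 0) > 15) {
    findings.push(`large-class: ${entity.name} (${entity.methodCount} methods > 15)`);
  }
  return findings;
}
```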
Example:
import { CodeSmellEngine } from '@wundr.io/analysis-engine';
const smellEngine = new CodeSmellEngine();
const smells = await smellEngine.analyze(entities, config);
smells.forEach(smell => {
console.log(`\n[${smell.severity}] ${smell.type}`);
console.log(`File: ${smell.file}:${smell.line}`);
console.log(`Message: ${smell.message}`);
console.log(`Suggestion: ${smell.suggestion}`);
});
6. Unused Export Engine
Finds exported entities that are never imported elsewhere.
Features:
- Cross-file import tracking (sketched below)
- Public API detection
- Test file exclusion options
- Usage frequency analysis
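Conceptually, unused-export detection is a set difference: collect every export, collect every name imported from another file, and report the exports that are never imported. A minimal sketch:

```typescript
interface ExportRecord { file: string; name: string; }
interface ImportRecord { fromFile: string; name: string; } // fromFile = exporting file

function findUnusedExports(
  exports: ExportRecord[],
  imports: ImportRecord[]
): ExportRecord[] {
  // Key by "<file>#<name>" so identical names in different files stay distinct.
  const used = new Set(imports.map(i => `${i.fromFile}#${i.name}`));
  return exports.filter(e => !used.has(`${e.file}#${e.name}`));
}
```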
Example:
import { UnusedExportEngine } from '@wundr.io/analysis-engine';
const unusedEngine = new UnusedExportEngine();
const unused = await unusedEngine.analyze(entities, config);
console.log(`Found ${unused.length} unused exports`);
unused.forEach(entity => {
console.log(`${entity.name} in ${entity.file}:${entity.line}`);
});
Performance Optimizations
Worker Pool Management
Intelligent concurrent processing with auto-scaling workers.
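Conceptually, the scaling rule implied by the resourceThresholds option below is simple: add workers while work is queued and CPU/memory sit under their thresholds, and remove workers when the queue is empty. A sketch of one scaling step (the package's actual policy may differ):

```typescript
// One scaling decision; usage values are fractions in [0, 1].
function nextPoolSize(opts: {
  current: number; min: number; max: number;
  queueSize: number;
  cpu: number; memory: number;
  cpuThreshold: number; memoryThreshold: number;
}): number {
  const underLimits =
    opts.cpu < opts.cpuThreshold && opts.memory < opts.memoryThreshold;
  if (opts.queueSize > 0 && underLimits) {
    return Math.min(opts.max, opts.current + 1); // scale up
  }
  if (opts.queueSize === 0) {
    return Math.max(opts.min, opts.current - 1); // scale down
  }
  return opts.current; // saturated: hold steady
}
```

The package's own API is shown below.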
import { WorkerPoolManager } from '@wundr.io/analysis-engine';
const workerPool = new WorkerPoolManager({
minWorkers: 4,
maxWorkers: 30,
idleTimeout: 60000,
taskTimeout: 300000,
enableAutoScaling: true,
resourceThresholds: {
cpu: 0.85,
memory: 0.9,
},
});
// Execute tasks concurrently
const results = await Promise.all(tasks.map(task => workerPool.execute(task)));
// Monitor performance
const metrics = workerPool.getMetrics();
console.log(`Active workers: ${metrics.activeWorkers}`);
console.log(`Queue size: ${metrics.queueSize}`);
console.log(`Throughput: ${metrics.throughput} tasks/sec`);
console.log(`Error rate: ${metrics.errorRate}%`);
await workerPool.shutdown();
Streaming File Processor
Process large codebases with minimal memory footprint.
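The highWaterMark/lowWaterMark options in the example below implement classic two-watermark backpressure: pause the producer once the in-flight queue exceeds the high mark, and resume only after it drains below the low mark, so the processor never buffers an unbounded number of files. A minimal sketch of that mechanism:

```typescript
// Two-watermark gate: the gap between the marks prevents rapid
// pause/resume flapping around a single threshold.
class BackpressureGate {
  private paused = false;
  constructor(private readonly high: number, private readonly low: number) {}

  update(queueDepth: number): 'pause' | 'resume' | 'steady' {
    if (!this.paused && queueDepth >= this.high) {
      this.paused = true;
      return 'pause';
    }
    if (this.paused && queueDepth <= this.low) {
      this.paused = false;
      return 'resume';
    }
    return 'steady';
  }
}
```

The streaming processor itself: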
import { StreamingFileProcessor } from '@wundr.io/analysis-engine';
const processor = new StreamingFileProcessor({
batchSize: 100,
maxConcurrency: 10,
enableBackpressure: true,
highWaterMark: 1000,
lowWaterMark: 100,
});
processor.on('batch', batch => {
console.log(`Processing batch of ${batch.length} files`);
});
processor.on('progress', progress => {
console.log(`Processed ${progress.processed}/${progress.total} files`);
});
const results = await processor.processFiles(['src/**/*.ts'], async file => {
// Process each file
return analyzeFile(file);
});
console.log(`Processed ${results.length} files`);
console.log(`Peak memory: ${processor.getMemoryStats().peakUsage / 1024 / 1024} MB`);
Memory Monitor
Track memory usage and prevent leaks.
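One simple way to approximate leak detection is to watch for heap usage that rises across many consecutive snapshots; a simplified sketch of the idea (the package's detector is presumably more sophisticated):

```typescript
// Flag a suspected leak when heap usage grows monotonically over
// the last `window` snapshots: crude, but a useful first signal.
function looksLikeLeak(heapSamples: number[], window = 10): boolean {
  if (heapSamples.length < window) return false;
  const recent = heapSamples.slice(-window);
  return recent.every((value, i) => i === 0 || value > recent[i - 1]);
}

// e.g. push process.memoryUsage().heapUsed into an array on a timer
// and call looksLikeLeak(samples) after each sample.
```

The monitor's API: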
import { MemoryMonitor } from '@wundr.io/analysis-engine';
const monitor = new MemoryMonitor({
snapshotInterval: 5000, // 5 seconds
maxSnapshots: 200,
enableLeakDetection: true,
heapDumpThreshold: 0.9, // 90% of max memory
maxMemory: 500 * 1024 * 1024, // 500MB
});
monitor.on('warning', data => {
console.warn(`Memory warning: ${data.message}`);
console.warn(`Current usage: ${data.usage / 1024 / 1024} MB`);
});
monitor.on('critical', data => {
console.error(`Critical memory state: ${data.message}`);
// Trigger cleanup or halt processing
});
monitor.start();
// Your analysis code here
const stats = monitor.getStats();
console.log(`Peak memory: ${stats.peakUsage / 1024 / 1024} MB`);
console.log(`GC events: ${stats.gcEvents}`);
console.log(`Average heap: ${stats.averageHeap / 1024 / 1024} MB`);
monitor.stop();
CLI Integration
The analysis engine includes a powerful command-line interface.
Installation
npm install -g @wundr.io/analysis-engine
Commands
# Analyze a codebase
wundr-analyze analyze ./src
# With options
wundr-analyze analyze ./src \
--output ./reports \
--format json,html,markdown \
--max-complexity 10 \
--min-similarity 0.8 \
--concurrency 30 \
--enable-ai \
--verbose
# Exclude patterns
wundr-analyze analyze ./src \
--exclude "**/*.spec.ts,**/*.test.ts"
# Include test files
wundr-analyze analyze ./src --include-tests
CLI Options
| Option | Description | Default |
| ------------------ | --------------------------------------- | ------------------- |
| -o, --output | Output directory for reports | ./analysis-output |
| -f, --format | Output formats (json,html,markdown,csv) | json,html |
| --include-tests | Include test files in analysis | false |
| --exclude | Additional exclude patterns | - |
| --max-complexity | Max cyclomatic complexity threshold | 10 |
| --min-similarity | Min similarity for duplicates | 0.8 |
| --concurrency | Max concurrent file processing | 10 |
| --enable-ai | Enable AI-powered analysis | false |
| --verbose | Enable verbose output | false |
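To run the CLI from a Node script, for example as a CI gate, invoke it like any other binary. A sketch using only the flags documented above (assumes the package is installed globally or via npx so wundr-analyze is on the PATH):

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const run = promisify(execFile);

// A non-zero exit code rejects the promise, which fails the CI step.
async function analyzeInCI(): Promise<void> {
  const { stdout } = await run('wundr-analyze', [
    'analyze', './src',
    '--output', './reports',
    '--format', 'json',
    '--max-complexity', '10',
  ]);
  console.log(stdout);
}

analyzeInCI().catch(err => {
  console.error(err);
  process.exit(1);
});
```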
Benchmark Suite
Comprehensive performance benchmarking for optimizations.
import { PerformanceBenchmarkSuite } from '@wundr.io/analysis-engine';
const benchmark = new PerformanceBenchmarkSuite({
testDataSets: [
{
name: 'Small Project',
fileCount: 100,
avgFileSize: 5000,
complexity: 'low',
duplicateRatio: 0.1,
},
{
name: 'Medium Project',
fileCount: 1000,
avgFileSize: 8000,
complexity: 'medium',
duplicateRatio: 0.2,
},
{
name: 'Large Project',
fileCount: 10000,
avgFileSize: 10000,
complexity: 'high',
duplicateRatio: 0.3,
},
],
iterations: 5,
outputDir: './benchmarks',
enableProfiling: true,
memoryLimit: 500 * 1024 * 1024,
concurrencyLevels: [1, 5, 10, 20, 30],
});
// Run benchmarks
const results = await benchmark.runFullSuite();
// Display results
console.log('\nBenchmark Results:');
console.log(`Speedup: ${results.improvement.speedup}x`);
console.log(`Memory reduction: ${results.improvement.memoryReduction}%`);
console.log(`Throughput increase: ${results.improvement.throughputIncrease}%`);
console.log(`Overall score: ${results.improvement.overallScore}`);
// Generate report
await benchmark.generateReport(results, 'html');
Benchmark Metrics
- Execution Time: Total analysis duration
- Throughput: Files processed per second
- Memory Usage: Peak and average memory consumption
- CPU Usage: Average and peak CPU utilization
- Concurrency Efficiency: Worker pool utilization
- Cache Performance: Hit rates and efficiency
- Error Rate: Failed operations percentage
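The derived figures follow the usual definitions; for example, with illustrative (not measured) numbers:

```typescript
// Illustrative values only, not measured results.
const baselineMs = 44_000;  // unoptimized run
const optimizedMs = 10_000; // optimized run
const filesProcessed = 150_000;

const speedup = baselineMs / optimizedMs;                        // 4.4x
const throughputPerSec = filesProcessed / (optimizedMs / 1000);  // 15,000 files/sec
console.log({ speedup, throughputPerSec });
```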
Configuration Options
Analysis Config
interface AnalysisConfig {
// Target configuration
targetDir: string;
exclude: string[];
includeTests: boolean;
// Output configuration
outputFormats: ('json' | 'html' | 'markdown' | 'csv')[];
outputDir: string;
// Performance tuning
performance: {
maxConcurrency: number;
chunkSize: number;
enableStreaming: boolean;
enableMemoryOptimization: boolean;
memoryLimit: number;
};
// Duplicate detection
duplicateDetection: {
minSimilarity: number;
enableSemanticAnalysis: boolean;
enableStructuralAnalysis: boolean;
clusteringAlgorithm: 'hash' | 'hierarchical' | 'density';
};
// Complexity thresholds
complexity: {
maxCyclomaticComplexity: number;
maxCognitiveComplexity: number;
maxNestingDepth: number;
maxFunctionLength: number;
maxParameters: number;
};
// AI features
enableAIAnalysis: boolean;
aiConfig: {
model: string;
temperature: number;
maxTokens: number;
};
// Optimizations
useOptimizations: boolean;
}
Related Packages
The Analysis Engine is part of the Wundr ecosystem:
- @wundr.io/cli - Command-line interface and project orchestration
- @wundr.io/governance - Governance framework and policy engine
- @wundr.io/drift-detection - Code quality drift monitoring
- @wundr.io/pattern-standardization - Pattern detection and auto-fixing
- @wundr.io/dependency-analyzer - Advanced dependency analysis
- @wundr.io/test-management - Test coverage and baseline tracking
- @wundr.io/monorepo-manager - Monorepo management utilities
API Reference
Core Classes
- AnalysisEngine - Main orchestrator for all analysis operations
- SimpleAnalyzer - Simplified analysis interface
- ASTParserEngine - TypeScript/JavaScript AST parsing
- DuplicateDetectionEngine - Standard duplicate detection
- OptimizedDuplicateDetectionEngine - Memory-optimized duplicate detection
- ComplexityMetricsEngine - Complexity analysis
- CircularDependencyEngine - Circular dependency detection
- CodeSmellEngine - Code smell identification
- UnusedExportEngine - Unused export detection
Performance Components
- WorkerPoolManager - Concurrent task execution
- StreamingFileProcessor - Memory-efficient file processing
- MemoryMonitor - Memory tracking and leak detection
- PerformanceBenchmarkSuite - Benchmarking utilities
Utilities
- generateNormalizedHash - Create normalized code hashes
- generateSemanticHash - Generate semantic similarity hashes
- createId - Generate unique identifiers
- processConcurrently - Concurrent processing helper
Examples
Example 1: Full Codebase Analysis
import { analyzeProject } from '@wundr.io/analysis-engine';
async function analyzeCodebase() {
const report = await analyzeProject('/path/to/project', {
outputFormats: ['json', 'html'],
performance: {
maxConcurrency: 30,
enableStreaming: true,
},
});
console.log(`\nAnalysis Summary:`);
console.log(`Total Files: ${report.summary.totalFiles}`);
console.log(`Total Entities: ${report.summary.totalEntities}`);
console.log(`Duplicate Clusters: ${report.duplicates.clusters.length}`);
console.log(`Circular Dependencies: ${report.circularDependencies.length}`);
console.log(`Code Smells: ${report.codeSmells.length}`);
console.log(`Unused Exports: ${report.unusedExports.length}`);
console.log(`Average Complexity: ${report.complexity.averageCyclomaticComplexity}`);
console.log(`Technical Debt: ${report.complexity.totalTechnicalDebt} hours`);
}
Example 2: Targeted Complexity Analysis
import { AnalysisEngine, ComplexityMetricsEngine } from '@wundr.io/analysis-engine';
async function findComplexFunctions() {
const engine = new AnalysisEngine({
targetDir: './src',
exclude: ['**/*.spec.ts'],
});
const report = await engine.analyze();
const complexFunctions = report.complexity.complexityHotspots
.filter(h => h.complexity.cyclomatic > 20)
.sort((a, b) => b.rank - a.rank);
console.log(`\nTop 10 Most Complex Functions:`);
complexFunctions.slice(0, 10).forEach((hotspot, i) => {
console.log(`\n${i + 1}. ${hotspot.entity.name}`);
console.log(` File: ${hotspot.entity.file}:${hotspot.entity.line}`);
console.log(` Cyclomatic: ${hotspot.complexity.cyclomatic}`);
console.log(` Cognitive: ${hotspot.complexity.cognitive}`);
console.log(` Maintainability: ${hotspot.complexity.maintainability}`);
});
}
Example 3: Duplicate Code Cleanup
import { OptimizedDuplicateDetectionEngine } from '@wundr.io/analysis-engine';
async function findDuplicates() {
const engine = new OptimizedDuplicateDetectionEngine({
minSimilarity: 0.85,
enableSemanticAnalysis: true
});
const entities = []; // ... get entities from the AST parser ...
const clusters = await engine.analyze(entities, config);
console.log(`\nFound ${clusters.length} duplicate clusters\n`);
clusters
.filter(c => c.severity === 'critical' || c.severity === 'high')
.forEach(cluster => {
console.log(`\nCluster: ${cluster.type} (${(cluster.similarity * 100).toFixed(1)}% similar)`);
console.log(`Severity: ${cluster.severity}`);
console.log(`Instances:`);
cluster.entities.forEach(e => {
console.log(` - ${e.file}:${e.line} (${e.name})`);
});
if (cluster.consolidationSuggestion) {
const suggestion = cluster.consolidationSuggestion;
console.log(`\nRecommendation: ${suggestion.strategy}`);
console.log(`Target: ${suggestion.targetFile}`);
console.log(`Effort: ${suggestion.estimatedEffort}`);
console.log(`Impact: ${suggestion.impact}`);
console.log(`Steps:`);
suggestion.steps.forEach(step => console.log(` ${step}`));
}
});
});
Contributing
We welcome contributions! Please see our Contributing Guide for details.
License
MIT © Adaptic.ai
Support
- Documentation: https://wundr.io/docs
- GitHub Issues: https://github.com/adapticai/wundr/issues
- Discord: https://discord.gg/wundr
Built with excellence by the Wundr team at Adaptic.ai
