@seedts/performance
Performance optimization utilities for SeedTS - Batch processing, streaming, progress reporting, and benchmarking for large-scale seeding operations.
Installation
npm install @seedts/performance
# or
pnpm add @seedts/performance
# or
yarn add @seedts/performance
Features
- ⚡ Batch Processing - Process records in configurable batches
- 📊 Progress Reporting - Track and display seeding progress
- 🌊 Streaming - Memory-efficient processing for large datasets
- 📈 Benchmarking - Measure and compare performance
- 🔍 Profiling - Detailed performance analysis
- 💾 Memory Management - Monitor and optimize memory usage
Quick Start
Batch Processing
import { processBatches, createBatchProcessor } from '@seedts/performance';
// Process 10,000 records in batches of 1,000
const results = await processBatches(
users,
async (batch) => {
return await adapter.insert('users', batch);
},
{
batchSize: 1000,
parallel: true,
maxParallel: 5,
onProgress: (progress) => {
console.log(`Progress: ${progress.percentage}%`);
}
}
);
Progress Reporting
import { createConsoleReporter } from '@seedts/performance';
const reporter = createConsoleReporter({ format: 'detailed' });
await processBatches(users, batchInsert, {
batchSize: 1000,
onProgress: reporter
});
Benchmarking
import { createBenchmarkSuite } from '@seedts/performance';
const suite = createBenchmarkSuite('Seeding Performance')
.add('Sequential', async () => {
await seedSequentially(users);
})
.add('Batched', async () => {
await processBatches(users, batchInsert, { batchSize: 1000 });
});
await suite.run(10);
suite.printResults();
Batch Processing
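All of the helpers in this section build on the same primitive: slice the input array into fixed-size chunks, then feed each chunk to an async processor. A minimal sketch of that core idea (illustrative only: `chunkArray` takes its name from this package's API reference, but the implementation here is an assumption, not the package's source):

```typescript
// Hypothetical sketch of a chunking helper in the spirit of chunkArray.
function chunkArray<T>(array: T[], chunkSize: number): T[][] {
  if (chunkSize <= 0) throw new RangeError('chunkSize must be positive');
  const chunks: T[][] = [];
  for (let i = 0; i < array.length; i += chunkSize) {
    chunks.push(array.slice(i, i + chunkSize));
  }
  return chunks;
}

// A batch runner then just maps each chunk through an async processor.
// (Sequential here; a parallel mode would dispatch chunks via Promise.all.)
async function runInBatches<T, R>(
  data: T[],
  processor: (batch: T[]) => Promise<R>,
  batchSize: number
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunkArray(data, batchSize)) {
    results.push(await processor(batch));
  }
  return results;
}
```

The real `processBatches` adds parallelism, delays, and progress callbacks on top of this loop.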
processBatches()
Process data in batches with configurable options.
const results = await processBatches(
data,
async (batch) => {
return await adapter.insert('table', batch);
},
{
batchSize: 1000, // Records per batch
parallel: true, // Process batches in parallel
maxParallel: 5, // Max concurrent batches
delayBetweenBatches: 100, // Delay in ms
onBatchComplete: (batchNum, count) => {
console.log(`Batch ${batchNum}: ${count} records`);
},
onProgress: (progress) => {
console.log(`${progress.percentage}%`);
}
}
);
createBatchProcessor()
Create a reusable batch processor for an adapter.
const batchInsert = createBatchProcessor(adapter, 'users');
const results = await processBatches(users, batchInsert, { batchSize: 1000 });
calculateOptimalBatchSize()
Calculate optimal batch size based on record size.
const avgRecordSize = 500; // bytes
const batchSize = calculateOptimalBatchSize(avgRecordSize, 10); // 10MB per batch
// Returns optimal size between 100 and 10,000
processBatchesWithRetry()
Process batches with automatic retry on failure.
const results = await processBatchesWithRetry(
data,
processor,
{ batchSize: 1000 },
3 // Max 3 retries with exponential backoff
);
Progress Reporting
ProgressBar
Terminal progress bar with ETA and rate display.
import { ProgressBar } from '@seedts/performance';
const bar = new ProgressBar(totalRecords, 40);
for (let i = 0; i < totalRecords; i++) {
// Process record
bar.tick();
}
createConsoleReporter()
Create a progress reporter for console output.
const reporter = createConsoleReporter({
showETA: true,
showRate: true,
format: 'detailed' // or 'simple'
});
await processBatches(data, processor, { onProgress: reporter });
MultiProgressTracker
Track progress of multiple concurrent operations.
import { MultiProgressTracker } from '@seedts/performance';
const tracker = new MultiProgressTracker();
tracker.register('users', 10000);
tracker.register('posts', 50000);
// Update progress
tracker.update('users', { recordsProcessed: 5000, percentage: 50 });
tracker.update('posts', { recordsProcessed: 25000, percentage: 50 });
tracker.stop();
ProgressEmitter
Event-based progress tracking.
import { ProgressEmitter } from '@seedts/performance';
const emitter = new ProgressEmitter();
emitter.on('progress', (data) => {
console.log(`Progress: ${data.percentage}%`);
});
emitter.on('complete', (data) => {
console.log('Completed!');
});
emitter.emit('progress', { percentage: 50 });
Streaming
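Streaming rests on lazy chunk generation: instead of materializing the whole dataset, chunks are yielded one at a time, so only one chunk's worth of records is held in memory. A rough sketch of the idea behind `generateChunks` (`generateChunksSketch` is a hypothetical name; this is not the package's implementation):

```typescript
// Illustrative async generator: yields fixed-size chunks lazily.
async function* generateChunksSketch<T>(
  data: Iterable<T>,
  chunkSize: number
): AsyncGenerator<T[]> {
  let chunk: T[] = [];
  for (const item of data) {
    chunk.push(item);
    if (chunk.length === chunkSize) {
      yield chunk;
      chunk = [];
    }
  }
  if (chunk.length > 0) yield chunk; // flush the final partial chunk
}
```

Consumers iterate with `for await`, which naturally pauses production while a chunk is being processed.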
processStream()
Process data as a stream for memory efficiency.
import { processStream } from '@seedts/performance';
const results = await processStream(
largeDataset,
async (chunk) => {
return await adapter.insert('table', chunk);
},
{
chunkSize: 100,
highWaterMark: 16
}
);
generateChunks()
Generate chunks for async iteration.
import { generateChunks } from '@seedts/performance';
for await (const chunk of generateChunks(data, 100)) {
await adapter.insert('users', chunk);
}
createStreamProcessor()
Create a streaming processor for an adapter.
import { createStreamProcessor } from '@seedts/performance';
const processor = createStreamProcessor(adapter, 'users', {
chunkSize: 100
});
await processor.process(users, (recordsProcessed) => {
console.log(`Processed: ${recordsProcessed}`);
});
createBackPressureStream()
Handle back pressure with concurrency limits.
import { createBackPressureStream } from '@seedts/performance';
const stream = createBackPressureStream(
async (item) => {
return await processItem(item);
},
5 // Max 5 concurrent operations
);
Benchmarking
benchmark()
Benchmark a single function.
import { benchmark, formatMemorySize } from '@seedts/performance';
const result = await benchmark('insert-users', async () => {
await adapter.insert('users', users);
}, 10); // 10 iterations
console.log(`${result.operationsPerSecond.toFixed(2)} ops/sec`);
console.log(`Memory: ${formatMemorySize(result.memoryDelta)}`);
compareBenchmarks()
Compare multiple approaches.
import { compareBenchmarks, printBenchmarkComparison } from '@seedts/performance';
const results = await compareBenchmarks([
{ name: 'Sequential', fn: async () => await seedSequential() },
{ name: 'Parallel', fn: async () => await seedParallel() },
{ name: 'Batched', fn: async () => await seedBatched() }
], 10);
printBenchmarkComparison(results);
BenchmarkSuite
Create a benchmark suite.
import { createBenchmarkSuite } from '@seedts/performance';
const suite = createBenchmarkSuite('Database Operations')
.add('Insert 1000', async () => {
await adapter.insert('users', generateUsers(1000));
})
.add('Insert 10000', async () => {
await adapter.insert('users', generateUsers(10000));
})
.add('Batch Insert 10000', async () => {
await processBatches(generateUsers(10000), batchInsert, { batchSize: 1000 });
});
await suite.run(5); // 5 iterations per benchmark
suite.printResults();
Output:
====================================================================================================
Benchmark Results
====================================================================================================
Name | Operations | Duration | Ops/sec | Avg Time | Memory
----------------------------------------------------------------------------------------------------
Insert 1000 | 5 | 523.45ms | 9.55 | 104.69ms | 2.34 MB
Insert 10000 | 5 | 5234.12ms | 0.96 | 1046.82ms | 23.45 MB
Batch Insert 10000 | 5 | 2456.78ms | 2.04 | 491.36ms | 15.67 MB
====================================================================================================
Comparison:
Insert 1000: Fastest
Insert 10000: 9.96x slower (896.00% slower)
Batch Insert 10000: 4.68x slower (368.00% slower)
measureTime() / measureMemory() / measure()
Quick measurements.
import { measureTime, measureMemory, measure } from '@seedts/performance';
const duration = await measureTime(async () => {
await seedUsers();
});
const memoryUsed = await measureMemory(async () => {
await seedUsers();
});
const { duration: total, memory } = await measure(async () => {
await seedUsers();
});
Profiling
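The start/end profiling pattern boils down to recording a timestamp per label and diffing on completion. A toy sketch of that mechanism (`TinyProfiler` is illustrative only, not the package's `Profiler` class):

```typescript
import { performance } from 'node:perf_hooks';

// Toy label-based profiler: start() records a timestamp, end() computes
// and stores the elapsed duration for that label.
class TinyProfiler {
  private starts = new Map<string, number>();
  private durations = new Map<string, number>();

  start(label: string): void {
    this.starts.set(label, performance.now());
  }

  end(label: string): number {
    const begin = this.starts.get(label);
    if (begin === undefined) throw new Error(`no start() recorded for "${label}"`);
    const ms = performance.now() - begin;
    this.durations.set(label, ms);
    this.starts.delete(label);
    return ms;
  }

  result(label: string): number | undefined {
    return this.durations.get(label);
  }
}
```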
Profiler
Track performance of multiple operations.
import { profiler } from '@seedts/performance';
profiler.start('generate-data');
const data = generateData();
profiler.end('generate-data');
profiler.start('insert-data');
await adapter.insert('users', data);
profiler.end('insert-data');
profiler.printResults();
profileFunction()
Profile a single function.
import { profileFunction, formatMemorySize } from '@seedts/performance';
const { result, profile } = await profileFunction('seed-users', async () => {
return await executor.execute();
}, { metadata: { count: 10000 } });
console.log(`Duration: ${profile.duration}ms`);
console.log(`Memory: ${formatMemorySize(profile.memoryDelta)}`);
createProfilingSession()
Create an isolated profiling session.
import { createProfilingSession } from '@seedts/performance';
const session = createProfilingSession();
await session.profile('task-1', async () => {
// Task 1
});
await session.profile('task-2', async () => {
// Task 2
});
const metrics = session.getMetrics();
session.printResults();
MemoryMonitor
Monitor memory usage over time.
import { createMemoryMonitor } from '@seedts/performance';
const monitor = createMemoryMonitor();
monitor.start(1000); // Sample every 1 second
// Run operations
await seedUsers();
monitor.stop();
monitor.printReport();
Complete Examples
Example 1: Large Dataset with Progress
import {
processBatches,
createConsoleReporter,
ProgressBar
} from '@seedts/performance';
async function seedLargeDataset() {
const users = generateUsers(100000);
const reporter = createConsoleReporter({
format: 'detailed',
showETA: true,
showRate: true
});
const results = await processBatches(
users,
async (batch) => {
return await adapter.insert('users', batch);
},
{
batchSize: 1000,
parallel: true,
maxParallel: 5,
onProgress: reporter
}
);
console.log(`Successfully seeded ${users.length} users`);
}
Example 2: Performance Comparison
import {
createBenchmarkSuite,
processBatches,
processStream
} from '@seedts/performance';
async function compareApproaches() {
const users = generateUsers(10000);
const suite = createBenchmarkSuite('Seeding Strategies')
.add('Sequential Insert', async () => {
for (const user of users) {
await adapter.insert('users', [user]);
}
})
.add('Batch Insert (1000)', async () => {
await processBatches(
users,
async (batch) => await adapter.insert('users', batch),
{ batchSize: 1000 }
);
})
.add('Parallel Batches', async () => {
await processBatches(
users,
async (batch) => await adapter.insert('users', batch),
{ batchSize: 1000, parallel: true, maxParallel: 5 }
);
})
.add('Streaming', async () => {
await processStream(
users,
async (chunk) => await adapter.insert('users', chunk),
{ chunkSize: 100 }
);
});
await suite.run(3);
suite.printResults();
}
Example 3: Memory-Efficient Processing
import {
generateChunks,
createMemoryMonitor
} from '@seedts/performance';
async function seedWithMemoryMonitoring() {
const monitor = createMemoryMonitor();
monitor.start(500); // Sample every 500ms
let processedCount = 0;
for await (const chunk of generateChunks(largeDataset, 100)) {
await adapter.insert('records', chunk);
processedCount += chunk.length;
console.log(`Processed: ${processedCount}`);
}
monitor.stop();
monitor.printReport();
}
Example 4: Profiled Seed Execution
import {
createProfilingSession,
formatDuration,
formatMemorySize
} from '@seedts/performance';
async function profiledSeedExecution() {
const session = createProfilingSession();
let users;
await session.profile('generate-users', async () => {
users = generateUsers(10000);
});
await session.profile('insert-users', async () => {
await processBatches(users, batchInsert, { batchSize: 1000 });
});
await session.profile('create-indexes', async () => {
await adapter.query('CREATE INDEX idx_email ON users(email)');
});
session.printResults();
const metrics = session.getMetrics();
console.log(`\nTotal Duration: ${formatDuration(metrics.totalDuration)}`);
console.log(`Peak Memory: ${formatMemorySize(metrics.peakMemory)}`);
console.log(`Throughput: ${metrics.operationsPerSecond.toFixed(2)} ops/sec`);
}
Example 5: Streaming with Back Pressure
import {
createStreamProcessor,
ProgressBar
} from '@seedts/performance';
async function streamWithBackPressure() {
const users = generateUsers(100000);
const processor = createStreamProcessor(adapter, 'users', {
chunkSize: 100,
highWaterMark: 16
});
const bar = new ProgressBar(users.length);
await processor.process(users, (recordsProcessed) => {
bar.update(recordsProcessed);
});
bar.complete();
}
API Reference
Batch Processing
- processBatches(data, processor, options) - Process data in batches
- createBatchProcessor(adapter, tableName) - Create batch processor
- calculateOptimalBatchSize(recordSize, maxMemory) - Calculate batch size
- chunkArray(array, chunkSize) - Split array into chunks
- processBatchesWithRetry(data, processor, options, maxRetries) - Batch with retry
Progress Reporting
- ProgressBar(total, width) - Terminal progress bar
- createConsoleReporter(options) - Console progress reporter
- createSilentReporter() - No-op reporter
- createCustomReporter(callback) - Custom reporter
- formatDuration(ms) - Format duration
- MultiProgressTracker - Multi-operation tracking
- ProgressEmitter - Event-based progress
Streaming
- processStream(data, processor, options) - Stream processing
- generateChunks(data, chunkSize) - Async generator
- createStreamProcessor(adapter, tableName, options) - Stream processor
- createBackPressureStream(processor, maxConcurrency) - Back pressure handling
Benchmarking
- benchmark(name, fn, iterations) - Single benchmark
- compareBenchmarks(benchmarks, iterations) - Compare multiple
- BenchmarkSuite - Benchmark suite class
- createBenchmarkSuite(name) - Create suite
- measureTime(fn) - Measure execution time
- measureMemory(fn) - Measure memory usage
- measure(fn) - Measure both
Profiling
- Profiler - Profiler class
- profiler - Global profiler instance
- profileFunction(name, fn, metadata) - Profile function
- createProfilingSession() - Profiling session
- MemoryMonitor - Memory monitoring
- createMemoryMonitor() - Create monitor
Utilities
- getMemoryUsage() - Current memory stats
- formatMemorySize(bytes) - Format memory size
- calculateMetrics(profiles) - Calculate metrics
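To make the utility signatures concrete, here is a plausible sketch of a byte formatter in the style of formatMemorySize (the unit ladder and two-decimal precision are assumptions, not the package's documented behavior):

```typescript
// Hypothetical sketch of a formatMemorySize-style helper: scale a byte
// count down through binary units and render with two decimals.
function formatMemorySizeSketch(bytes: number): string {
  const units = ['B', 'KB', 'MB', 'GB'];
  let value = Math.abs(bytes);
  let unit = 0;
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit++;
  }
  const sign = bytes < 0 ? '-' : ''; // memory deltas can be negative
  return `${sign}${value.toFixed(2)} ${units[unit]}`;
}
```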
TypeScript Support
All functions are fully typed:
import type {
BatchOptions,
ProgressInfo,
StreamOptions,
BenchmarkResult,
ProfilingResult,
PerformanceMetrics
} from '@seedts/performance';
Best Practices
Choose the Right Batch Size
- Use calculateOptimalBatchSize() based on your record size
- Balance between memory usage and database round trips
- Typical range: 100-10,000 records per batch
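The sizing rule can be sketched as a simple clamp of the memory budget over the average record size (the 100-10,000 bounds come from the calculateOptimalBatchSize() docs above; the rest of this helper is an assumption):

```typescript
// Sketch of the documented sizing rule: fit as many records as the
// per-batch memory budget allows, clamped to the 100..10,000 range.
function optimalBatchSize(avgRecordSizeBytes: number, maxBatchMemoryMB: number): number {
  const raw = Math.floor((maxBatchMemoryMB * 1024 * 1024) / avgRecordSizeBytes);
  return Math.min(10_000, Math.max(100, raw));
}
```

With the README's own numbers (500-byte records, 10 MB per batch), this clamps to the 10,000-record ceiling.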
Monitor Memory
- Use MemoryMonitor for long-running operations
- Check for memory leaks with profiling
- Consider streaming for very large datasets
Use Progress Reporting
- Provide feedback for long operations
- Help users understand performance characteristics
- Enable early cancellation if needed
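A progress callback typically derives percentage and ETA from the processed count and elapsed time. A sketch of that arithmetic (the recordsProcessed and percentage field names match this README's examples; etaMs is a hypothetical addition):

```typescript
// What a progress snapshot might carry, derived from three raw numbers.
interface ProgressSnapshot {
  recordsProcessed: number;
  total: number;
  percentage: number;
  etaMs: number; // estimated time remaining, assuming a steady rate
}

function progressSnapshot(processed: number, total: number, elapsedMs: number): ProgressSnapshot {
  const percentage = total === 0 ? 100 : (processed / total) * 100;
  const rate = elapsedMs > 0 ? processed / elapsedMs : 0; // records per ms
  const etaMs = rate > 0 ? (total - processed) / rate : 0;
  return { recordsProcessed: processed, total, percentage, etaMs };
}
```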
Benchmark Before Optimizing
- Measure actual performance, don't guess
- Compare different approaches
- Profile to find bottlenecks
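At its core a benchmark is just a timing loop; a minimal sketch of the idea (a real harness like benchmark() above would also warm up the JIT and sample memory, both omitted here):

```typescript
import { performance } from 'node:perf_hooks';

// Run fn a fixed number of times and derive total, average, and throughput.
async function timeIt(
  fn: () => Promise<void>,
  iterations: number
): Promise<{ totalMs: number; avgMs: number; opsPerSecond: number }> {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    await fn();
  }
  const totalMs = performance.now() - start;
  return {
    totalMs,
    avgMs: totalMs / iterations,
    opsPerSecond: iterations / (totalMs / 1000),
  };
}
```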
Handle Back Pressure
- Use streaming for memory efficiency
- Limit concurrent operations
- Consider throttling if needed
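Limiting concurrency amounts to running a fixed pool of workers over a shared queue, which is the idea behind createBackPressureStream. A sketch (mapWithLimit is a hypothetical helper, not part of the package):

```typescript
// Process items with at most maxConcurrency in flight, preserving order.
async function mapWithLimit<T, R>(
  items: T[],
  processor: (item: T) => Promise<R>,
  maxConcurrency: number
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unclaimed index. The claim
  // (read-then-increment) is synchronous, so workers never collide.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await processor(items[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(maxConcurrency, items.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}
```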
License
MIT
