plan-viz

v0.1.14

Published

7 months ago

Convert Apache Data Fusion Physical Execution Plans to Excalidraw JSON format

0High
0Medium
0Low

nga-tran

apache datafusion data-fusion execution-plan excalidraw visualization query-plan sql diagram arrow apache-arrow

plan-viz

Convert Apache DataFusion physical execution plans into Excalidraw‑JSON for clear visualization and deeper understanding. The generated diagrams use color coding to highlight key properties and propagate them throughout the plan, making it easy to see how many streams or partitions run in parallel at each operator and whether sort order is preserved and leveraged. The visualization also pinpoints operators where parallelism or sort order is lost, helping you quickly identify bottlenecks and guide performance improvements.

Additional advantages of using Excalidraw:

Edit or extend the graphical plan directly in Excalidraw, then re‑save it back to JSON
Export the visualization to PNG or SVG for sharing in documentation, presentations, or reports
Collaborate interactively: Excalidraw supports real‑time editing, so teams can review and annotate plans together
Layer annotations or styling to emphasize specific operators, execution paths, or performance concerns
Maintain a living artifact: Excalidraw files can serve as both a visualization and editable source of truth

Repository: GitHub | Issues: Report a bug

Example

Input:

An indented EXPLAIN. You may also provide SQL or just the plan; the function automatically extracts physical operators.

| physical_plan | SortExec: expr=[env@0 ASC NULLS LAST, time_bin@1 ASC NULLS LAST], preserve_partitioning=[false]                                                                                                                                                                                                                                                                                                                                                                                            |
|               |   AggregateExec: mode=Single, gby=[env@1 as env, time_bin@0 as time_bin], aggr=[avg(a.max_bin_val)]                                                                                                                                                                                                                                                                                                                                                                                        |
|               |     ProjectionExec: expr=[date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)@1 as time_bin, max(j.env)@2 as env, max(j.value)@3 as max_bin_val]                                                                                                                                                                                                                                                                           |
|               |       AggregateExec: mode=Final, gby=[f_dkey@0 as f_dkey, date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)@1 as date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)], aggr=[max(j.env), max(j.value)], ordering_mode=Sorted                                                                                                                             |
|               |         SortPreservingMergeExec: [f_dkey@0 ASC NULLS LAST, date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)@1 ASC NULLS LAST]                                                                                                                                                                                                                                                                                           |
|               |           AggregateExec: mode=Partial, gby=[f_dkey@0 as f_dkey, date_bin(IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }, timestamp@1) as date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)], aggr=[max(j.env), max(j.value)], ordering_mode=Sorted                                                                                                                                                |
|               |             ProjectionExec: expr=[f_dkey@1 as f_dkey, timestamp@2 as timestamp, env@0 as env, value@3 as value]                                                                                                                                                                                                                                                                                                                                                                            |
|               |               CoalesceBatchesExec: target_batch_size=8192                                                                                                                                                                                                                                                                                                                                                                                                                                  |
|               |                 HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(d_dkey@0, f_dkey@0)], projection=[env@1, f_dkey@2, timestamp@3, value@4]                                                                                                                                                                                                                                                                                                                                             |
|               |                   AggregateExec: mode=Final, gby=[d_dkey@0 as d_dkey, env@1 as env], aggr=[]                                                                                                                                                                                                                                                                                                                                                                                               |
|               |                     CoalescePartitionsExec                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|               |                       AggregateExec: mode=Partial, gby=[d_dkey@0 as d_dkey, env@1 as env], aggr=[]                                                                                                                                                                                                                                                                                                                                                                                         |
|               |                         CoalesceBatchesExec: target_batch_size=8192                                                                                                                                                                                                                                                                                                                                                                                                                        |
|               |                           FilterExec: service@2 = log, projection=[d_dkey@0, env@1]                                                                                                                                                                                                                                                                                                                                                                                                        |
|               |                             DataSourceExec: file_groups={2 groups: [[d1.parquet], [d2.parquet]]}, projection=[d_dkey, env, service], file_type=parquet, predicate=service@2 = log, pruning_predicate=service_null_count@2 != row_count@3 AND service_min@0 <= log AND log <= service_max@1, required_guarantees=[service in (log)] |
|               |                   DataSourceExec: file_groups={3 groups: [[f1.parquet, f4.parquet, f5.parquet, f6.parquet, f7.parquet, f8.parquet], [f2.parquet], [f3.parquet, f9.parquet, f10.parquet]]}, projection=[f_dkey, timestamp, value], output_ordering=[f_dkey@0 ASC NULLS LAST, timestamp@1 ASC NULLS LAST], file_type=parquet, predicate=DynamicFilter [ empty ]             |

Output

A JSON‑formatted plan that, when opened in Excalidraw, visualizes the plan as a tree.

Arrows between nodes represent streams or partitions of data. Multiple arrows indicate that data is being streamed in parallel and independently.
Blue highlights denote attributes that are sorted.

Execution Plan Visualization

Output Analysis

This section adds pink annotations to clarify the roles of arrows, circles/ellipses, and color coding. Plans are read bottom-up, and the visualization clearly shows what each operator does, how parallel execution is maintained, and how sort order is preserved and leveraged.

Execution Plan Visualization with Analyses

Prerequisites

Node.js 20+ (LTS)
npm or yarn

Installation

For Development

git clone https://github.com/NGA-TRAN/plan_viz.git
cd plan_viz
npm install
npm run build

As a Package (published at https://www.npmjs.com/package/plan-viz)

npm install plan-viz

Usage

As a Library

import { convertPlanToExcalidraw } from 'plan-viz';

const executionPlan = `
ProjectionExec: expr=[id, name, age]
  FilterExec: age > 18
    DataSourceExec: file_groups={1 groups: [[data.parquet]]}
`;

const excalidrawJson = convertPlanToExcalidraw(executionPlan);
console.log(JSON.stringify(excalidrawJson, null, 2));

With Custom Configuration:

import { convertPlanToExcalidraw } from 'plan-viz';

const excalidrawJson = convertPlanToExcalidraw(executionPlan, {
  generator: {
    nodeWidth: 250,
    nodeHeight: 100,
    verticalSpacing: 120,
    horizontalSpacing: 60,
    operatorFontSize: 20,
    detailsFontSize: 16,
    nodeColor: '#64748b',
    arrowColor: '#64748b',
  },
});

As a CLI

After package installation npm install plan-viz:

# Basic usage: from file
plan-viz -i tests/join.sql -o output.excalidraw

# From stdin
cat tests/join.sql | plan-viz > output.excalidraw

# With custom dimensions and spacing
plan-viz -i tests/join.sql -o output.excalidraw \
  --node-width 250 \
  --node-height 100 \
  --vertical-spacing 120 \
  --horizontal-spacing 60

Building from Source Code:

# After building the project
npm run build

# Basic usage: from file
node dist/cli.js -i tests/join.sql -o output.excalidraw

# From stdin
cat tests/join.sql | node dist/cli.js > output.excalidraw

# With custom dimensions and spacing
node dist/cli.js -i tests/join.sql -o output.excalidraw \
  --node-width 250 \
  --node-height 100 \
  --vertical-spacing 120 \
  --horizontal-spacing 60

CLI Options:

-i, --input <file> - Input file containing the execution plan
-o, --output <file> - Output file for Excalidraw JSON
--node-width <number> - Width of each node box (default: 200)
--node-height <number> - Height of each node box (default: 80)
--vertical-spacing <number> - Vertical spacing between nodes (default: 100)
--horizontal-spacing <number> - Horizontal spacing between sibling nodes (default: 50)

Viewing (and Editing) the Output

Option 1: Use the UI App plan-visualizer, customized for this library

Enter your EXPLAIN output in the input panel
Click the Visualize button to see the graphical output

Option 2: Use Online Excalidraw

Go to Excalidraw
Click "Open" in the top menu
Upload your generated .excalidraw file
View your execution plan diagram!

Option 3: Directly in your IDE

If you use an IDE such as VSCode or Cursor, you can install the Excalidraw extension and view the diagram directly in your IDE:

Install the Excalidraw extension from the marketplace (e.g., "Excalidraw" by pomdtr)
Open your generated .excalidraw file in your IDE
The diagram will render automatically in the editor

Note: The tests/ directory serves a dual purpose:
Test fixtures: .sql files containing physical plans (and their SQL queries) with expected outputs in [tests/expected/] for integration tests
Examples: .sql files containing physical plans you can run directly with the CLI (plan-viz -i tests/join.sql -o output.excalidraw) or use as input to the UI App in Option 1
See the tests/ directory for sample execution plans. Expected JSON‑formatted outputs are available in tests/expected/ and can be opened using Option 2 or Option 3.

API

`convertPlanToExcalidraw(plan: string, config?: ConverterConfig): ExcalidrawData`

Converts an Apache DataFusion physical execution plan into Excalidraw-compatible JSON for visualization.

Parameters:

plan - The physical execution plan as a string
config - Optional configuration object (see below)

Returns:

ExcalidrawData - Excalidraw-compatible JSON object

Throws:

Error - If the plan text is invalid or empty

Configuration Options:

interface ConverterConfig {
  parser?: {
    indentationSize?: number;      // Default: 2
    extractProperties?: boolean;    // Default: true
  };
  generator?: {
    nodeWidth?: number;               // Default: 200
    nodeHeight?: number;              // Default: 80
    verticalSpacing?: number;         // Default: 100
    horizontalSpacing?: number;       // Default: 50
    operatorFontSize?: number;        // Default: 18 (for operator name)
    detailsFontSize?: number;         // Default: 14 (for properties/details)
    nodeColor?: string;               // Default: '#1971c2'
    arrowColor?: string;              // Default: '#495057'
  };
}

Example:

import { convertPlanToExcalidraw } from 'plan-viz';

const plan = `
ProjectionExec: expr=[id, name, age]
  FilterExec: age > 18
    DataSourceExec: file_groups={1 groups: [[data.parquet]]}
`;

const result = convertPlanToExcalidraw(plan, {
  generator: {
    nodeWidth: 250,
    nodeHeight: 100,
    nodeColor: '#64748b',
  },
});

Examples

The project includes numerous example execution plans in the tests/ directory, including:

Data source plans (dataSource*.sql)
Filter operations (filter*.sql)
Repartitioning (repartition*.sql)
Aggregation examples (*aggregate*.sql)
Projection (*projection*.sql)
Join operations (join*.sql, join_sort_merge*.sql)
Sorting (sort*.sql)
Union (union*.sql)
And many more!

Note: The tests/ directory serves a dual purpose: test fixtures and examples

Each example includes:

A .sql file with the execution plan (in tests/)
A .excalidraw file showing the expected Excalidraw-compatible JSON for visualization (in tests/expected/)

Try them out:

# Convert an example (after building)
npm run build
node dist/cli.js -i tests/join.sql -o output.excalidraw

# Or use the usage example script
npx ts-node tests/usage-example.ts

Development

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

# Run tests with coverage
npm test:coverage

# Lint
npm run lint

# Format
npm run format

Project Structure

See PROJECT_STRUCTURE.md for detailed project organization.

Architecture

The project follows Clean Code principles and SOLID design patterns. See ARCHITECTURE.md for comprehensive architecture documentation.

Key Highlights:

SOLID Principles: Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, Dependency Inversion
Design Patterns: Coordinator, Strategy, Factory, Builder, Renderer patterns
Components: ExecutionPlanParser, ExcalidrawGenerator (coordinator), ConverterService
Testability: >95% test coverage across all components

Testing

# Run all tests
npm test

# Watch mode
npm run test:watch

# Coverage report
npm run test:coverage

Current coverage: >95% (branches, functions, lines, statements)

Code Quality

Linting: ESLint with Google TypeScript style guide
Formatting: Prettier
Type Safety: Strict TypeScript configuration
Testing: Jest with comprehensive test suites

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details.

# Use commitizen for commits
npm run commit

Commit Convention

This project uses Conventional Commits:

feat: - New features
fix: - Bug fixes
docs: - Documentation changes
test: - Test updates
refactor: - Code refactoring

Roadmap

[X] Interactive web interface: plan-visualizer — try it out!
[ ] Support for additional DataFusion operators (see MISSING_OPERATORS)
[ ] Enhanced options for custom styling
[ ] Performance optimizations for large plans

Resources

Apache DataFusion
Excalidraw API
Google TypeScript Style Guide
Quick Start Guide - Get started quickly
Project Overview - Detailed architecture and design

License

MIT - see LICENSE for details

Authors

Created with ❤️ for the Apache DataFusion community

Changelog

See CHANGELOG.md for a detailed history of changes.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

plan-viz

Example

Input:

Output

Output Analysis

Prerequisites

Installation

For Development

As a Package (published at https://www.npmjs.com/package/plan-viz)

Usage

As a Library

As a CLI

Viewing (and Editing) the Output

Option 1: Use the UI App plan-visualizer, customized for this library

Option 2: Use Online Excalidraw

Option 3: Directly in your IDE

API

convertPlanToExcalidraw(plan: string, config?: ConverterConfig): ExcalidrawData

Examples

Development

Project Structure

Architecture

Testing

Code Quality

Contributing

Commit Convention

Roadmap

Resources

License

Authors

Changelog

`convertPlanToExcalidraw(plan: string, config?: ConverterConfig): ExcalidrawData`