plan-viz
v0.1.14
Published
Convert Apache Data Fusion Physical Execution Plans to Excalidraw JSON format
Downloads
341
Maintainers
Readme
plan-viz
Convert Apache DataFusion physical execution plans into Excalidraw‑JSON for clear visualization and deeper understanding. The generated diagrams use color coding to highlight key properties and propagate them throughout the plan, making it easy to see how many streams or partitions run in parallel at each operator and whether sort order is preserved and leveraged. The visualization also pinpoints operators where parallelism or sort order is lost, helping you quickly identify bottlenecks and guide performance improvements.
Additional advantages of using Excalidraw:
- Edit or extend the graphical plan directly in Excalidraw, then re‑save it back to JSON
- Export the visualization to PNG or SVG for sharing in documentation, presentations, or reports
- Collaborate interactively: Excalidraw supports real‑time editing, so teams can review and annotate plans together
- Layer annotations or styling to emphasize specific operators, execution paths, or performance concerns
- Maintain a living artifact: Excalidraw files can serve as both a visualization and editable source of truth
Repository: GitHub | Issues: Report a bug
Example
Input:
An indented EXPLAIN. You may also provide SQL or just the plan; the function automatically extracts physical operators.
| physical_plan | SortExec: expr=[env@0 ASC NULLS LAST, time_bin@1 ASC NULLS LAST], preserve_partitioning=[false] |
| | AggregateExec: mode=Single, gby=[env@1 as env, time_bin@0 as time_bin], aggr=[avg(a.max_bin_val)] |
| | ProjectionExec: expr=[date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)@1 as time_bin, max(j.env)@2 as env, max(j.value)@3 as max_bin_val] |
| | AggregateExec: mode=Final, gby=[f_dkey@0 as f_dkey, date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)@1 as date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)], aggr=[max(j.env), max(j.value)], ordering_mode=Sorted |
| | SortPreservingMergeExec: [f_dkey@0 ASC NULLS LAST, date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)@1 ASC NULLS LAST] |
| | AggregateExec: mode=Partial, gby=[f_dkey@0 as f_dkey, date_bin(IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }, timestamp@1) as date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 30000000000 }"),j.timestamp)], aggr=[max(j.env), max(j.value)], ordering_mode=Sorted |
| | ProjectionExec: expr=[f_dkey@1 as f_dkey, timestamp@2 as timestamp, env@0 as env, value@3 as value] |
| | CoalesceBatchesExec: target_batch_size=8192 |
| | HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(d_dkey@0, f_dkey@0)], projection=[env@1, f_dkey@2, timestamp@3, value@4] |
| | AggregateExec: mode=Final, gby=[d_dkey@0 as d_dkey, env@1 as env], aggr=[] |
| | CoalescePartitionsExec |
| | AggregateExec: mode=Partial, gby=[d_dkey@0 as d_dkey, env@1 as env], aggr=[] |
| | CoalesceBatchesExec: target_batch_size=8192 |
| | FilterExec: service@2 = log, projection=[d_dkey@0, env@1] |
| | DataSourceExec: file_groups={2 groups: [[d1.parquet], [d2.parquet]]}, projection=[d_dkey, env, service], file_type=parquet, predicate=service@2 = log, pruning_predicate=service_null_count@2 != row_count@3 AND service_min@0 <= log AND log <= service_max@1, required_guarantees=[service in (log)] |
| | DataSourceExec: file_groups={3 groups: [[f1.parquet, f4.parquet, f5.parquet, f6.parquet, f7.parquet, f8.parquet], [f2.parquet], [f3.parquet, f9.parquet, f10.parquet]]}, projection=[f_dkey, timestamp, value], output_ordering=[f_dkey@0 ASC NULLS LAST, timestamp@1 ASC NULLS LAST], file_type=parquet, predicate=DynamicFilter [ empty ] |Output
A JSON‑formatted plan that, when opened in Excalidraw, visualizes the plan as a tree.
- Arrows between nodes represent streams or partitions of data. Multiple arrows indicate that data is being streamed in parallel and independently.
- Blue highlights denote attributes that are sorted.

Output Analysis
This section adds pink annotations to clarify the roles of arrows, circles/ellipses, and color coding. Plans are read bottom-up, and the visualization clearly shows what each operator does, how parallel execution is maintained, and how sort order is preserved and leveraged.

Prerequisites
- Node.js 20+ (LTS)
- npm or yarn
Installation
For Development
git clone https://github.com/NGA-TRAN/plan_viz.git
cd plan_viz
npm install
npm run buildAs a Package (published at https://www.npmjs.com/package/plan-viz)
npm install plan-vizUsage
As a Library
import { convertPlanToExcalidraw } from 'plan-viz';
const executionPlan = `
ProjectionExec: expr=[id, name, age]
FilterExec: age > 18
DataSourceExec: file_groups={1 groups: [[data.parquet]]}
`;
const excalidrawJson = convertPlanToExcalidraw(executionPlan);
console.log(JSON.stringify(excalidrawJson, null, 2));With Custom Configuration:
import { convertPlanToExcalidraw } from 'plan-viz';
const excalidrawJson = convertPlanToExcalidraw(executionPlan, {
generator: {
nodeWidth: 250,
nodeHeight: 100,
verticalSpacing: 120,
horizontalSpacing: 60,
operatorFontSize: 20,
detailsFontSize: 16,
nodeColor: '#64748b',
arrowColor: '#64748b',
},
});As a CLI
After package installation npm install plan-viz:
# Basic usage: from file
plan-viz -i tests/join.sql -o output.excalidraw
# From stdin
cat tests/join.sql | plan-viz > output.excalidraw
# With custom dimensions and spacing
plan-viz -i tests/join.sql -o output.excalidraw \
--node-width 250 \
--node-height 100 \
--vertical-spacing 120 \
--horizontal-spacing 60Building from Source Code:
# After building the project
npm run build
# Basic usage: from file
node dist/cli.js -i tests/join.sql -o output.excalidraw
# From stdin
cat tests/join.sql | node dist/cli.js > output.excalidraw
# With custom dimensions and spacing
node dist/cli.js -i tests/join.sql -o output.excalidraw \
--node-width 250 \
--node-height 100 \
--vertical-spacing 120 \
--horizontal-spacing 60CLI Options:
-i, --input <file>- Input file containing the execution plan-o, --output <file>- Output file for Excalidraw JSON--node-width <number>- Width of each node box (default: 200)--node-height <number>- Height of each node box (default: 80)--vertical-spacing <number>- Vertical spacing between nodes (default: 100)--horizontal-spacing <number>- Horizontal spacing between sibling nodes (default: 50)
Viewing (and Editing) the Output
Option 1: Use the UI App plan-visualizer, customized for this library
- Enter your
EXPLAINoutput in the input panel - Click the Visualize button to see the graphical output
Option 2: Use Online Excalidraw
- Go to Excalidraw
- Click "Open" in the top menu
- Upload your generated
.excalidrawfile - View your execution plan diagram!
Option 3: Directly in your IDE
If you use an IDE such as VSCode or Cursor, you can install the Excalidraw extension and view the diagram directly in your IDE:
- Install the Excalidraw extension from the marketplace (e.g., "Excalidraw" by pomdtr)
- Open your generated
.excalidrawfile in your IDE - The diagram will render automatically in the editor
Note: The
tests/directory serves a dual purpose:
- Test fixtures:
.sqlfiles containing physical plans (and their SQL queries) with expected outputs in [tests/expected/] for integration tests- Examples:
.sqlfiles containing physical plans you can run directly with the CLI (plan-viz -i tests/join.sql -o output.excalidraw) or use as input to the UI App in Option 1See the
tests/directory for sample execution plans. Expected JSON‑formatted outputs are available intests/expected/and can be opened using Option 2 or Option 3.
API
convertPlanToExcalidraw(plan: string, config?: ConverterConfig): ExcalidrawData
Converts an Apache DataFusion physical execution plan into Excalidraw-compatible JSON for visualization.
Parameters:
plan- The physical execution plan as a stringconfig- Optional configuration object (see below)
Returns:
ExcalidrawData- Excalidraw-compatible JSON object
Throws:
Error- If the plan text is invalid or empty
Configuration Options:
interface ConverterConfig {
parser?: {
indentationSize?: number; // Default: 2
extractProperties?: boolean; // Default: true
};
generator?: {
nodeWidth?: number; // Default: 200
nodeHeight?: number; // Default: 80
verticalSpacing?: number; // Default: 100
horizontalSpacing?: number; // Default: 50
operatorFontSize?: number; // Default: 18 (for operator name)
detailsFontSize?: number; // Default: 14 (for properties/details)
nodeColor?: string; // Default: '#1971c2'
arrowColor?: string; // Default: '#495057'
};
}Example:
import { convertPlanToExcalidraw } from 'plan-viz';
const plan = `
ProjectionExec: expr=[id, name, age]
FilterExec: age > 18
DataSourceExec: file_groups={1 groups: [[data.parquet]]}
`;
const result = convertPlanToExcalidraw(plan, {
generator: {
nodeWidth: 250,
nodeHeight: 100,
nodeColor: '#64748b',
},
});Examples
The project includes numerous example execution plans in the tests/ directory, including:
- Data source plans (
dataSource*.sql) - Filter operations (
filter*.sql) - Repartitioning (
repartition*.sql) - Aggregation examples (
*aggregate*.sql) - Projection (
*projection*.sql) - Join operations (
join*.sql,join_sort_merge*.sql) - Sorting (
sort*.sql) - Union (
union*.sql) - And many more!
Note: The
tests/directory serves a dual purpose: test fixtures and examples
Each example includes:
- A
.sqlfile with the execution plan (intests/) - A
.excalidrawfile showing the expected Excalidraw-compatible JSON for visualization (intests/expected/)
Try them out:
# Convert an example (after building)
npm run build
node dist/cli.js -i tests/join.sql -o output.excalidraw
# Or use the usage example script
npx ts-node tests/usage-example.tsDevelopment
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm test
# Run tests with coverage
npm test:coverage
# Lint
npm run lint
# Format
npm run formatProject Structure
See PROJECT_STRUCTURE.md for detailed project organization.
Architecture
The project follows Clean Code principles and SOLID design patterns. See ARCHITECTURE.md for comprehensive architecture documentation.
Key Highlights:
- SOLID Principles: Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, Dependency Inversion
- Design Patterns: Coordinator, Strategy, Factory, Builder, Renderer patterns
- Components: ExecutionPlanParser, ExcalidrawGenerator (coordinator), ConverterService
- Testability: >95% test coverage across all components
Testing
# Run all tests
npm test
# Watch mode
npm run test:watch
# Coverage report
npm run test:coverageCurrent coverage: >95% (branches, functions, lines, statements)
Code Quality
- Linting: ESLint with Google TypeScript style guide
- Formatting: Prettier
- Type Safety: Strict TypeScript configuration
- Testing: Jest with comprehensive test suites
Contributing
We welcome contributions! Please see CONTRIBUTING.md for details.
# Use commitizen for commits
npm run commitCommit Convention
This project uses Conventional Commits:
feat:- New featuresfix:- Bug fixesdocs:- Documentation changestest:- Test updatesrefactor:- Code refactoring
Roadmap
- [X] Interactive web interface: plan-visualizer — try it out!
- [ ] Support for additional DataFusion operators (see MISSING_OPERATORS)
- [ ] Enhanced options for custom styling
- [ ] Performance optimizations for large plans
Resources
- Apache DataFusion
- Excalidraw API
- Google TypeScript Style Guide
- Quick Start Guide - Get started quickly
- Project Overview - Detailed architecture and design
License
MIT - see LICENSE for details
Authors
Created with ❤️ for the Apache DataFusion community
Changelog
See CHANGELOG.md for a detailed history of changes.
