
Holoscan Flow Benchmarking for HoloHub

Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Latest version: 0.1.0
Minimum Holoscan SDK version: 1.0.3
Tested Holoscan SDK versions: 1.0.3
Contribution metric: Level 3 - Developmental

This is a benchmarking tool for evaluating the performance of HoloHub and other Holoscan applications. The following is a high-level overview of Holoscan Flow Benchmarking. For more details on its possible use cases, please refer to the Holoscan Flow Benchmarking Tutorial (up-to-date) or the Holoscan Flow Benchmarking whitepaper.

The tool supports benchmarking of any Holoscan application. Holoscan Python applications are supported since Holoscan v1.0.


Pre-requisites

The following Python libraries need to be installed to run the benchmarking scripts (pip install -r requirements.txt can be used):

numpy matplotlib nvitop argparse pydot xdot

The preferred way to use Flow Benchmarking is the provided HoloHub Docker image, which manages these dependencies automatically. Otherwise, the required libraries must be installed on the bare-metal system, and extra system-level dependencies may also be needed; for example, xdot depends on additional system packages.
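
For a bare-metal setup, the following is a minimal install sketch; the requirements.txt path assumes you are at the root of the HoloHub repository:

# install the Python dependencies for the benchmarking scripts (bare-metal sketch)
$ pip install -r benchmarks/holoscan_flow_benchmarking/requirements.txt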

Steps for Holoscan Flow Benchmarking

  1. Patch the application for benchmarking
$ ./benchmarks/holoscan_flow_benchmarking/patch_application.sh <application directory>

For example, to patch the endoscopy tool tracking application, you would run:

$ ./benchmarks/holoscan_flow_benchmarking/patch_application.sh applications/endoscopy_tool_tracking
This script saves the original C++ source files as *.cpp.bak files.
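
As a quick sanity check (a sketch, using the endoscopy tool tracking application patched above), you can list the backup files the script created:

# list the backups created by the patch script
$ find applications/endoscopy_tool_tracking -name "*.cpp.bak"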

  2. Build the application
$ ./run build <application name> <other options> --benchmark

Please make sure that the application runs correctly after building it and before moving on to the performance evaluation steps. Running the application at least once also ensures that all the necessary TensorRT engine files are generated, for example, for the endoscopy tool tracking application.
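
For example, a sketch of this step for the endoscopy tool tracking application, assuming it is normally started with the default ./run launch <application name> cpp command described in the next step:

# build with benchmarking enabled, then launch once so that any required
# TensorRT engine files are generated before the actual evaluation
$ ./run build endoscopy_tool_tracking --benchmark
$ ./run launch endoscopy_tool_tracking cpp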

  3. Run the performance evaluation
$ python benchmarks/holoscan_flow_benchmarking/benchmark.py -a <application name> <other options>

The above command runs an application that is normally launched with ./run launch <application name> cpp. If an application is launched differently, use the --run-command argument to specify the command that runs the application.
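
For example, the following is a hypothetical sketch for an application launched with a custom command; the application name my_app and the quoted run command are placeholders:

# pass the actual launch command via --run-command when the default
# "./run launch <application name> cpp" does not apply
$ python benchmarks/holoscan_flow_benchmarking/benchmark.py -a my_app --run-command "./run launch my_app python" -r 1 -i 1 --sched greedy -d myoutputs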

python benchmarks/holoscan_flow_benchmarking/benchmark.py -h shows all the possible benchmarking options.

All the log filenames are printed out at the end of the evaluation. The format of the filename for the data flow tracking log files is: logger_<scheduler>_<run_number>_<instance-id>.log. The format of the filename for the GPU utilization log files is: gpu_utilization_<scheduler>_<run_number>.csv.

Example: When the endoscopy tool tracking application is evaluated with the greedy scheduler for 3 runs, with 3 instances each and 200 data frames, the following output is printed:

$ python benchmarks/holoscan_flow_benchmarking/benchmark.py -a endoscopy_tool_tracking -r 3 -i 3 -m 200 --sched greedy -d myoutputs
Log directory is not found. Creating a new directory at /home/ubuntu/holoscan-sdk/holohub-internal/myoutputs
Run 1 completed for greedy scheduler.
Run 2 completed for greedy scheduler.
Run 3 completed for greedy scheduler.

Evaluation completed.
Log file directory:  /home/ubuntu/holoscan-sdk/holohub/myoutputs
All the data flow tracking log files are: logger_greedy_1_1.log, logger_greedy_1_2.log, logger_greedy_1_3.log, logger_greedy_2_1.log, logger_greedy_2_2.log, logger_greedy_2_3.log, logger_greedy_3_1.log, logger_greedy_3_2.log, logger_greedy_3_3.log

  4. Get performance results and insights

$ python benchmarks/holoscan_flow_benchmarking/analyze.py -g <group of log files> <options>
python benchmarks/holoscan_flow_benchmarking/analyze.py -h shows all the possible options.

Example: For the above experiment with the benchmark.py script, we can analyze the worst-case and average end-to-end latencies with the following command:

python benchmarks/holoscan_flow_benchmarking/analyze.py -m -a -g myoutputs/logger_greedy_* MyCustomGroup
The above command produces output like the following:

[Figure: sample maximum and average latencies output]
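
The analyzer reports results per named group of log files, so different experiments can be compared side by side. The following is a sketch that assumes the -g option can be repeated and that a second experiment with a different scheduler produced logger_multithread_* log files in the same directory:

# compare two experiments as two named groups (sketch)
$ python benchmarks/holoscan_flow_benchmarking/analyze.py -m -a -g myoutputs/logger_greedy_* GreedyGroup -g myoutputs/logger_multithread_* MultithreadGroup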

We can also produce a CDF curve of the observed latencies for a single path with the following command:

$ python benchmarks/holoscan_flow_benchmarking/analyze.py --draw-cdf single_path_cdf.png -g myoutputs/logger_greedy_* MyCustomGroup --no-display-graphs
Saved the CDF curve graph of the first path of each group in: single_path_cdf.png

The single_path_cdf.png file looks like the following:

[Figure: single_path_cdf.png - CDF curve of the observed end-to-end latencies for a single path]

A few auxiliary scripts are also provided to help plot date-wise results. For example, the following command plots the average end-to-end latency along with the standard deviation for three consecutive dates:

python bar_plot_avg_datewise.py avg_values_2023-10-19.csv avg_values_2023-10-20.csv avg_values_2023-10-21.csv stddev_values_2023-10-19.csv stddev_values_2023-10-20.csv stddev_values_2023-10-21.csv

[Figure: avg_2023-10-21.png - average end-to-end latency with standard deviation across dates]

  5. Restore the application

When benchmarking is no longer needed, an application can be restored with the following command:

$ ./benchmarks/holoscan_flow_benchmarking/restore_application.sh <application directory>
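
For example, to restore the endoscopy tool tracking application patched earlier:

$ ./benchmarks/holoscan_flow_benchmarking/restore_application.sh applications/endoscopy_tool_tracking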

Generate Application Graph with Latency Numbers

The app_perf_graph.py script can be used to generate a graph of a Holoscan application with latency data from benchmarking embedded in it. The graph looks like the figure below, where nodes are operators annotated with their average and maximum execution times, and edges represent connections between operators annotated with the average and maximum data transfer latencies.

[Figure: example application performance graph with per-operator execution times and per-edge data transfer latencies]
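
For example, the following is a sketch of generating a static graph from a finished experiment, assuming the log files from the earlier example are in myoutputs and that Graphviz is installed to render the resulting .dot file:

# generate a .dot performance graph from existing log files, then render it to PNG
$ python3 benchmarks/holoscan_flow_benchmarking/app_perf_graph.py -o app_graph.dot myoutputs
$ dot -Tpng app_graph.dot -o app_graph.png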

It is also possible to generate such a graph and update it dynamically while the benchmark.py script is benchmarking an application. For example, the following three commands can be run in three different terminals to monitor the live performance of the endoscopy_tool_tracking application.

# the following command initiates a benchmarking job and generates performance log files
$ python3 benchmarks/holoscan_flow_benchmarking/benchmark.py -a endoscopy_tool_tracking -i 1 -d endoscopy_results --sched=greedy -r 3 -m 1000

# the following command keeps updating an application graph with the latest performance numbers
# -l means live mode
$ python3 benchmarks/holoscan_flow_benchmarking/app_perf_graph.py -o live_app_graph.dot -l endoscopy_results

# use another terminal to visualize the graph with xdot; the view refreshes as app_perf_graph.py updates live_app_graph.dot
$ xdot live_app_graph.dot