Holoscan Flow Benchmarking for HoloHub¶
Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Latest version: 0.1.0
Minimum Holoscan SDK version: 1.0.3
Tested Holoscan SDK versions: 1.0.3
Contribution metric: Level 3 - Developmental
This is a benchmarking tool to evaluate the performance of HoloHub and other Holoscan applications. The following is a high-level overview of Holoscan Flow Benchmarking. For more details on its possible use cases, please refer to the Holoscan Flow Benchmarking Tutorial (kept up to date) or the Holoscan Flow Benchmarking whitepaper.
The tool supports benchmarking of any Holoscan application. Holoscan Python applications are supported since Holoscan v1.0.
Pre-requisites¶
The following Python libraries need to be installed to run the benchmarking scripts (pip install -r requirements.txt can be used):
numpy matplotlib nvitop argparse pydot xdot
The preferred way to use Flow Benchmarking is the provided HoloHub docker image, which manages these dependencies automatically. Otherwise, a bare-metal installation of the required libraries is needed. Extra system packages may also need to be installed to satisfy these requirements; for example, xdot has additional system dependencies of its own.
Steps for Holoscan Flow Benchmarking¶
- Patch the application for benchmarking
$ ./benchmarks/holoscan_flow_benchmarking/patch_application.sh <application directory>
For example, to patch the endoscopy tool tracking application, you would run:
$ ./benchmarks/holoscan_flow_benchmarking/patch_application.sh applications/endoscopy_tool_tracking
The patching script saves the original cpp files in a *.cpp.bak file so that the application can later be restored.
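For reference, the patch enables the Holoscan SDK's data flow tracking in the application source. The sketch below shows roughly what flow tracking looks like when enabled by hand in a Holoscan Python application using the SDK's Tracker context manager; MyApp is a placeholder class, and the options configured by the patch script may differ.
from holoscan.core import Application, Tracker

class MyApp(Application):  # placeholder application
    def compose(self):
        ...  # operators and flows would be added here

app = MyApp()
# enable data flow tracking for the duration of the context
with Tracker(app) as tracker:
    app.run()
    tracker.print()  # print end-to-end latency statistics per path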
- Build the application
$ ./run build <application name> <other options> --benchmark
Please make sure the application runs correctly after it is built and before proceeding to the performance evaluation steps. Also run the application at least once beforehand (for example, the endoscopy tool tracking application) so that all the necessary TensorRT engine files are generated.
- Run the performance evaluation
$ python benchmarks/holoscan_flow_benchmarking/benchmark.py -a <application name> <other options>
The above command runs an application that is normally executed by ./run launch <application name> cpp. If an application is executed differently, use the --run-command argument to specify the command used to run it.
python benchmarks/holoscan_flow_benchmarking/benchmark.py -h shows all the possible benchmarking options.
All the log filenames are printed at the end of the evaluation. Data flow tracking log files are named logger_<scheduler>_<run_number>_<instance-id>.log, and GPU utilization log files are named gpu_utilization_<scheduler>_<run_number>.csv.
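If you want to post-process the logs outside of analyze.py, the documented filename pattern can be parsed directly. Below is a minimal sketch that groups the data flow tracking logs by scheduler and run number; the regular expression and helper function are illustrative and not part of the benchmarking tool.
import re
from collections import defaultdict
from pathlib import Path

# matches logger_<scheduler>_<run_number>_<instance-id>.log
LOG_PATTERN = re.compile(r"logger_(?P<scheduler>.+)_(?P<run>\d+)_(?P<instance>\d+)\.log")

def group_flow_logs(log_dir):
    """Group data flow tracking log files by (scheduler, run_number)."""
    groups = defaultdict(list)
    for path in Path(log_dir).glob("logger_*.log"):
        match = LOG_PATTERN.fullmatch(path.name)
        if match:
            groups[(match["scheduler"], int(match["run"]))].append(path)
    return groups

print(group_flow_logs("myoutputs"))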
Example: When the endoscopy tool tracking application is evaluated with the greedy scheduler for 3 runs, with 3 instances each and 200 data frames, the following output is printed:
$ python benchmarks/holoscan_flow_benchmarking/benchmark.py -a endoscopy_tool_tracking -r 3 -i 3 -m 200 --sched greedy -d myoutputs
Log directory is not found. Creating a new directory at /home/ubuntu/holoscan-sdk/holohub-internal/myoutputs
Run 1 completed for greedy scheduler.
Run 2 completed for greedy scheduler.
Run 3 completed for greedy scheduler.
Evaluation completed.
Log file directory: /home/ubuntu/holoscan-sdk/holohub/myoutputs
All the data flow tracking log files are: logger_greedy_1_1.log, logger_greedy_1_2.log, logger_greedy_1_3.log, logger_greedy_2_1.log, logger_greedy_2_2.log, logger_greedy_2_3.log, logger_greedy_3_1.log, logger_greedy_3_2.log, logger_greedy_3_3.log
- Get performance results and insights
$ python benchmarks/holoscan_flow_benchmarking/analyze.py -g <group of log files> <options>
python benchmarks/holoscan_flow_benchmarking/analyze.py -h shows all the possible options.
Example: For the above experiment with the benchmark.py script, we can analyze the worst-case and average end-to-end latencies with the following command:
python benchmarks/holoscan_flow_benchmarking/analyze.py -m -a -g myoutputs/logger_greedy_* MyCustomGroup
We can also produce a CDF curve of the observed latencies for a single path with the following command:
$ python benchmarks/holoscan_flow_benchmarking/analyze.py --draw-cdf single_path_cdf.png -g myoutputs/logger_greedy_* MyCustomGroup --no-display-graphs
Saved the CDF curve graph of the first path of each group in: single_path_cdf.png
The resulting single_path_cdf.png looks like the figure below:
A few auxiliary scripts are also provided to help plot date-wise results. For example, the following command plots the average end-to-end latency along with the standard deviation for three consecutive dates:
python bar_plot_avg_datewise.py avg_values_2023-10-19.csv avg_values_2023-10-20.csv avg_values_2023-10-21.csv stddev_values_2023-10-19.csv stddev_values_2023-10-20.csv stddev_values_2023-10-21.csv
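For a custom plot, a bar chart with error bars can also be put together directly with matplotlib. The sketch below is purely illustrative: it assumes each avg_values_*.csv and stddev_values_*.csv contains a single comma-separated row of latency values in milliseconds, which is not necessarily the format written by the benchmarking scripts.
import sys
import numpy as np
import matplotlib.pyplot as plt

# assumed layout (illustration only): one comma-separated row of latency values (ms) per file
def read_row(csv_file):
    return np.loadtxt(csv_file, delimiter=",", ndmin=1)

avg_files = sys.argv[1:4]     # e.g. avg_values_<date>.csv for three dates
stddev_files = sys.argv[4:7]  # e.g. stddev_values_<date>.csv for three dates

dates = [f.split("avg_values_")[-1].removesuffix(".csv") for f in avg_files]
averages = [read_row(f).mean() for f in avg_files]
errors = [read_row(f).mean() for f in stddev_files]

plt.bar(dates, averages, yerr=errors, capsize=4)
plt.ylabel("Average end-to-end latency (ms)")
plt.savefig("avg_latency_datewise.png")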
- Restore the application
If benchmarking is no longer necessary, an application can be restored with the following command:
$ ./benchmarks/holoscan_flow_benchmarking/restore_application.sh <application directory>
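Because each step above is an ordinary shell command, the whole patch, build, benchmark, analyze, and restore cycle can also be scripted. Below is a minimal sketch using Python's subprocess module; the application name, scheduler, run counts, and output directory are example values taken from the commands above.
import subprocess

app = "endoscopy_tool_tracking"  # example application
app_dir = f"applications/{app}"
out_dir = "myoutputs"

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

run(["./benchmarks/holoscan_flow_benchmarking/patch_application.sh", app_dir])
run(["./run", "build", app, "--benchmark"])
run(["python", "benchmarks/holoscan_flow_benchmarking/benchmark.py",
     "-a", app, "-r", "3", "-i", "3", "-m", "200", "--sched", "greedy", "-d", out_dir])
# expand the logger_greedy_* glob explicitly since no shell is involved
logs = [f"{out_dir}/logger_greedy_{r}_{i}.log" for r in (1, 2, 3) for i in (1, 2, 3)]
run(["python", "benchmarks/holoscan_flow_benchmarking/analyze.py", "-m", "-a", "-g", *logs, "MyCustomGroup"])
run(["./benchmarks/holoscan_flow_benchmarking/restore_application.sh", app_dir])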
Generate Application Graph with Latency Numbers¶
The app_perf_graph.py
script can be used to generate a graph of a Holoscan application with
latency data from benchmarking embedded into the graph. The graph looks like the figure below, where
graph nodes are operators along with their average and maximum execution times, and edges represent
connections between operators along with the average and maximum data transfer latencies.
It is also possible to generate such a graph and dynamically update it while running the benchmark.py script to benchmark an application. For example, the following three commands can be run in three different terminals to monitor the live performance of the endoscopy_tool_tracking application.
# the following command initiates a benchmarking job and generates performance log files
$ python3 benchmarks/holoscan_flow_benchmarking/benchmark.py -a endoscopy_tool_tracking -i 1 -d endoscopy_results --sched=greedy -r 3 -m 1000
# the following command keeps updating an application graph with the latest performance numbers
# -l means live mode
$ python3 benchmarks/holoscan_flow_benchmarking/app_perf_graph.py -o live_app_graph.dot -l endoscopy_results
# use another terminal to visualize the graph with xdot; the view refreshes as app_perf_graph.py updates the .dot file
$ xdot live_app_graph.dot