Distributed H.264 Endoscopy Tool Tracking Application ¶

Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Language: C++
Latest version: 1.0
Minimum Holoscan SDK version: 2.6.0
Tested Holoscan SDK versions: 2.6.0
Contribution metric: Level 0 - Core Stable

This application is similar to the H.264 Endoscopy Tool Tracking application, but it is divided into three fragments that can run together in one process or as separate processes:

Video Input: reads video input from a pre-recorded video file.
Inference: runs inference using the LSTM model, then runs the post-processing script.
Visualization: displays the input video and the inference results.

Requirements¶

This application is configured to use an H.264 elementary stream from the endoscopy sample data as input.

Data¶

📦️ (NGC) Sample App Data for AI-based Endoscopy Tool Tracking

The data is automatically downloaded when building the application.

Building and Running H.264 Endoscopy Tool Tracking Application¶

Build and run the application from the top-level Holohub directory:

C++¶

# Start the application with all three fragments
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language cpp

# Use the following commands to run the same application as three processes:
# Start the application with the video_in fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language cpp --run_args "--driver --worker --fragments video_in --address :10000 --worker-address :10001"
# Start the application with the inference fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language cpp --run_args "--worker --fragments inference --address :10000 --worker-address :10002"
# Start the application with the visualization fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language cpp --run_args "--worker --fragments viz --address :10000 --worker-address :10003"

Python¶

# Start the application with all three fragments
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language python

# Use the following commands to run the same application as three processes:
# Start the application with the video_in fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language python --run_args "--driver --worker --fragments video_in --address :10000 --worker-address :10001"
# Start the application with the inference fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language python --run_args "--worker --fragments inference --address :10000 --worker-address :10002"
# Start the application with the visualization fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language python --run_args "--worker --fragments viz --address :10000 --worker-address :10003"
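
The three-process commands above follow one pattern: every process passes the same `--address` (the App Driver's address, `:10000` here), only the first process adds `--driver`, and each worker gets its own `--fragments` name and `--worker-address` port. The sketch below makes that pattern explicit. It is only an illustration: `run_args_for` and `print_cmd` are hypothetical helper names, not part of Holohub, and the script prints the commands rather than running them.

```shell
#!/bin/sh
# Illustrative helper (not part of Holohub): rebuilds the --run_args strings
# used in the three-process commands above.

DRIVER_ADDR=":10000"   # App Driver address shared by all three processes

# run_args_for <fragment> <worker_port> [driver]
# Pass "driver" as the third argument for the one process that also hosts the App Driver.
run_args_for() {
    fragment="$1"      # video_in, inference, or viz
    worker_port="$2"   # port unique to this worker
    with_driver="$3"   # "driver" or empty
    prefix="--worker"
    if [ "$with_driver" = "driver" ]; then
        prefix="--driver --worker"
    fi
    echo "$prefix --fragments $fragment --address $DRIVER_ADDR --worker-address :$worker_port"
}

# Print the full launch command for one fragment (C++ build shown;
# change --language to python for the Python application).
print_cmd() {
    echo "./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language cpp --run_args \"$(run_args_for "$@")\""
}

print_cmd video_in 10001 driver
print_cmd inference 10002
print_cmd viz 10003
```

Each printed command is meant to be started as its own process, for example in a separate terminal, with the `video_in` process (which hosts the driver) typically started first.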

Important: on aarch64, the application also needs the tegra folder mounted inside the container, and the LD_LIBRARY_PATH environment variable must be updated to include the tegra folder path.

Open the Dockerfile and uncomment line 66:

# Uncomment the following line for aarch64 support
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/aarch64-linux-gnu/tegra/
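
After rebuilding the container on aarch64, a quick check run inside the container can confirm that the tegra directory actually ended up on `LD_LIBRARY_PATH`. The following is a minimal sketch, assuming the path from the Dockerfile line above; `has_tegra_path` is a hypothetical helper name, and the check only inspects the variable, not whether the folder is mounted.

```shell
#!/bin/sh
# Illustrative check (not part of Holohub): is the tegra directory listed
# in a colon-separated library path?

TEGRA_DIR="/usr/lib/aarch64-linux-gnu/tegra"

# has_tegra_path <path_list>
# Returns 0 if <path_list> contains TEGRA_DIR, with or without a trailing slash.
has_tegra_path() {
    case ":$1:" in
        *":$TEGRA_DIR:"*|*":$TEGRA_DIR/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_tegra_path "${LD_LIBRARY_PATH:-}"; then
    echo "tegra directory is on LD_LIBRARY_PATH"
else
    echo "tegra directory is missing from LD_LIBRARY_PATH"
fi
```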

Dev Container¶

To start the VS Code Dev Container, run the following command from the root directory of Holohub:

./dev_container vscode h264

VS Code Launch Profiles¶

C++¶

Use the (gdb) ucx_h264_endoscopy_tool_tracking/cpp (all fragments) launch profile to run and debug the C++ application.

Python¶

Use the (pythoncpp) ucx_h264_endoscopy_tool_tracking/python (all fragments) launch profile to run and debug the Python application.
