Skip to content

Distributed H.264 Endoscopy Tool Tracking Application

Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Language: C++
Latest version: 1.0
Minimum Holoscan SDK version: 2.6.0
Tested Holoscan SDK versions: 2.6.0
Contribution metric: Level 0 - Core Stable

This application is similar to the H.264 Endoscopy Tool Tracking application, but this distributed version divides the application into three fragments:

  1. Video Input: get video input from a pre-recorded video file.
  2. Inference: run the inference using LSTM and run the post-processing script.
  3. Visualization: display input video and inference results.

Requirements

This application is configured to use H.264 elementary stream from endoscopy sample data as input.

Data

📦️ (NGC) Sample App Data for AI-based Endoscopy Tool Tracking

The data is automatically downloaded when building the application.

Building and Running H.264 Endoscopy Tool Tracking Application

  • Building and running the application from the top level Holohub directory:

C++

# Start the application with all three fragments
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language cpp

# Use the following commands to run the same application three processes:
# Start the application with the video_in fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language cpp --run_args "--driver --worker --fragments video_in --address :10000 --worker-address :10001"
# Start the application with the inference fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language cpp --run_args "--worker --fragments inference --address :10000 --worker-address :10002"
# Start the application with the visualization fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language cpp --run_args "--worker --fragments viz --address :10000 --worker-address :10003"

Python

# Start the application with all three fragments
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language python

# Use the following commands to run the same application three processes:
# Start the application with the video_in fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language python --run_args "--driver --worker --fragments video_in --address :10000 --worker-address :10001"
# Start the application with the inference fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language python --run_args "--worker --fragments inference --address :10000 --worker-address :10002"
# Start the application with the visualization fragment
./dev_container build_and_run ucx_h264_endoscopy_tool_tracking --language python --run_args "--worker --fragments viz --address :10000 --worker-address :10003"

Important: on aarch64, applications also need tegra folder mounted inside the container and the LD_LIBRARY_PATH environment variable should be updated to include tegra folder path.

Open and edit the Dockerfile and uncomment line 66:

# Uncomment the following line for aarch64 support
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/aarch64-linux-gnu/tegra/

Dev Container

To start the VS Code Dev Container, run the following command from the root directory of Holohub:

./dev_container vscode h264

VS Code Launch Profiles

C++

Use the (gdb) ucx_h264_endoscopy_tool_tracking/cpp (all fragments) launch profile to run and debug the C++ application.

Python

Use the (pythoncpp) ucx_h264_endoscopy_tool_tracking/python (all fragments) launch profile to run and debug the Python application.