Distributed Endoscopy Tool Tracking
Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Language: C++
Latest version: 1.0
Minimum Holoscan SDK version: 2.1.0
Tested Holoscan SDK versions: 2.1.0
Contribution metric: Level 0 - Core Stable
This application is similar to the Endoscopy Tool Tracking application, but the distributed version divides the application into three fragments:
- Video Input: reads video input from a pre-recorded video file.
- Inference: runs inference with the LSTM model, then runs the post-processing script.
- Visualization: displays the input video together with the inference results.
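The data flow implied by the three fragment descriptions can be sketched in plain C++. This is only an illustrative model, not Holoscan SDK code: the `Flow` alias and the `fragment_flows` and `upstream_of` helpers are hypothetical names, the fragment names `video_in`, `inference`, and `viz` are the ones used in the run commands on this page, and the exact set of connections is an assumption inferred from the descriptions (inference consumes the video, visualization consumes both the video and the inference results).

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical plain-C++ model of the fragment layout; not Holoscan SDK code.
// A flow is a (producer fragment, consumer fragment) pair.
using Flow = std::pair<std::string, std::string>;

// Assumed connections, inferred from the fragment descriptions.
inline std::vector<Flow> fragment_flows() {
  return {{"video_in", "inference"},
          {"video_in", "viz"},
          {"inference", "viz"}};
}

// Returns the names of the fragments that send data to `fragment`.
inline std::set<std::string> upstream_of(const std::string& fragment) {
  std::set<std::string> producers;
  for (const Flow& flow : fragment_flows()) {
    if (flow.second == fragment) {
      producers.insert(flow.first);
    }
  }
  return producers;
}
```

In this model, `upstream_of("viz")` contains both `video_in` and `inference`, which is why the visualization fragment cannot show results until the other two fragments are running.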
Based on a stateful LSTM (long short-term memory) model, these applications demonstrate the use of custom components for tool tracking, including the composition and rendering of text, tool position, and mask (as a heatmap) combined with the original video stream.
Requirements
The provided applications are configured to use a pre-recorded endoscopy video (replayer).
Data
📦️ (NGC) Sample App Data for AI-based Endoscopy Tool Tracking
The data is automatically downloaded and converted to the correct format when building the application. If you want to manually convert the video data, please refer to the instructions for using the convert_video_to_gxf_entities script.
Run Instructions
# Build the Holohub container for the Distributed Endoscopy Tool Tracking application
./dev_container build --docker_file applications/distributed/ucx/ucx_endoscopy_tool_tracking/Dockerfile --img holohub:ucx_endoscopy_tool_tracking
# Launch the container
./dev_container launch --img holohub:ucx_endoscopy_tool_tracking
# Build the Distributed Endoscopy Tool Tracking application
./run build ucx_endoscopy_tool_tracking
# Generate the TRT engine file from onnx
python3 utilities/generate_trt_engine.py --input data/endoscopy/tool_loc_convlstm.onnx --output data/endoscopy/engines/ --fp16
# Start the application with all three fragments
./run launch ucx_endoscopy_tool_tracking cpp
# Once you have completed the step to generate the TRT engine file, you may exit the container and
# use the following commands to run the application in distributed mode:
# Start the application with the video_in fragment
./dev_container build_and_run ucx_endoscopy_tool_tracking --language cpp --run_args "--driver --worker --fragments video_in --address :9999"
# Start the application with the inference fragment
./dev_container build_and_run ucx_endoscopy_tool_tracking --language cpp --run_args "--worker --fragments inference --address :9999"
# Start the application with the visualization fragment
./dev_container build_and_run ucx_endoscopy_tool_tracking --language cpp --run_args "--worker --fragments viz --address :9999"
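All three distributed-mode commands pass `--address :9999`, that is, a port with no host part. As a minimal sketch, assuming the value has the form `[host][:port]`, the split could look like the code below. `DriverAddress` and `parse_address` are hypothetical names used only for illustration; the Holoscan SDK parses this flag itself, and this sketch does not handle IPv6 literals.

```cpp
#include <string>

// Hypothetical helper, not part of the Holoscan SDK.
// Holds the two optional parts of an address written as "[host][:port]".
struct DriverAddress {
  std::string host;  // empty when the host part is omitted
  int port = -1;     // -1 when the port part is omitted
};

// Splits `text` at the last ':' into host and port.
inline DriverAddress parse_address(const std::string& text) {
  DriverAddress address;
  const std::string::size_type colon = text.rfind(':');
  // substr(0, npos) yields the whole string when there is no ':'.
  address.host = text.substr(0, colon);
  if (colon != std::string::npos && colon + 1 < text.size()) {
    // std::stoi throws std::invalid_argument on non-numeric input.
    address.port = std::stoi(text.substr(colon + 1));
  }
  return address;
}
```

With this sketch, `parse_address(":9999")` yields an empty host and port 9999, so each of the three processes refers to the same port and leaves the host to the default.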