Endoscopy Depth Estimation #

Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Language: Python
Last modified: May 13, 2025
Latest version: 1.0
Minimum Holoscan SDK version: 0.6.0
Tested Holoscan SDK versions: 0.6.0
Contribution metric: Level 2 - Trusted

This application demonstrates how to use custom components for depth estimation and how to render the result with Holoviz using triangle interpolation.

Requirements#

Python 3.8+
OpenCV 4.8+

Data#

📦️ (NGC) Sample App Data for Endoscopy

The data is automatically downloaded and converted to the correct format when building the application. If you want to manually convert the video data, please refer to the instructions for using the convert_video_to_gxf_entities script.

Model#

📦️ (NGC) App Model for AI-based Endoscopy Depth Estimation

The model is automatically downloaded to the same folder as the data in ONNX format.

OpenCV-GPU#

This application uses OpenCV with GPU acceleration during the preprocessing stage when it runs with Histogram Equalization (flag --clahe or -c). Histogram equalization reduces the effect of specular reflections and improves the visual performance of the depth estimation overall. However, using regular OpenCV datatypes leads to unnecessary I/O operations to transfer data from Holoscan Tensors to the CPU and back. We show in this application how to blend together Holoscan Tensors and OpenCV's GPUMat datatype to get rid of this issue in the CUDACLAHEOp operator. Compare it to CPUCLAHEOp for reference.

To achieve an end-to-end GPU-accelerated pipeline, the preprocessing operators must access GPU memory (Holoscan Tensor) directly, without any memory copy or movement. This means that only libraries which implement the __cuda_array_interface__ and DLPack standards, such as cuCIM, allow conversion from and to Holoscan Tensor. OpenCV, however, implements neither __cuda_array_interface__ nor the DLPack standard, so some extra work is needed to use this library.

First, we convert CuPy arrays to GPUMat using a fix in OpenCV that is available only from version 4.8.0 onward. More information here. This is done in the gpumat_from_cp_array function. With a GPUMat, we can use any OpenCV-CUDA operation. Once the GPUMat processing has finished, we convert the result back to a CuPy tensor with gpumat_to_cupy.
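As a rough illustration of what such a conversion needs, the sketch below derives the arguments for a raw-pointer GPUMat constructor from a `__cuda_array_interface__` dictionary. The helper name `gpumat_args_from_cai` is hypothetical and is not the application's gpumat_from_cp_array. It assumes a C-contiguous array laid out as height x width or height x width x channels, and it uses OpenCV's standard type-code layout.

```python
# Hypothetical helper for illustration; not the application's gpumat_from_cp_array.
# OpenCV depth codes for a few common dtypes: CV_8U = 0, CV_16U = 2, CV_32F = 5.
_CV_DEPTH = {"|u1": 0, "<u2": 2, "<f4": 5}
_ITEMSIZE = {"|u1": 1, "<u2": 2, "<f4": 4}


def gpumat_args_from_cai(cai):
    """Return (rows, cols, cv_type, device_ptr, step_bytes) for an HxW or HxWxC array."""
    shape = cai["shape"]
    if len(shape) not in (2, 3):
        raise ValueError("expected an HxW or HxWxC array")
    # In the CUDA Array Interface, strides is None for C-contiguous data.
    if cai.get("strides") is not None:
        raise ValueError("this sketch only handles C-contiguous arrays")
    typestr = cai["typestr"]
    if typestr not in _CV_DEPTH:
        raise ValueError("unsupported dtype: " + typestr)

    rows, cols = shape[0], shape[1]
    channels = shape[2] if len(shape) == 3 else 1
    # Same arithmetic as OpenCV's CV_MAKETYPE(depth, channels).
    cv_type = _CV_DEPTH[typestr] + ((channels - 1) << 3)
    device_ptr = cai["data"][0]  # (pointer, read_only) tuple
    step_bytes = cols * channels * _ITEMSIZE[typestr]
    return rows, cols, cv_type, device_ptr, step_bytes


# Example with a made-up pointer value for a 480x640 RGB uint8 frame.
frame_cai = {
    "shape": (480, 640, 3),
    "typestr": "|u1",
    "data": (123456, False),
    "strides": None,
    "version": 3,
}
print(gpumat_args_from_cai(frame_cai))
```

With a CuPy array `arr`, the dictionary would come from `arr.__cuda_array_interface__`. In an OpenCV 4.8+ build with CUDA support, values like these are the kind of arguments that `cv2.cuda.createGpuMatFromCudaMemory` is understood to take; check the function in the application source for the exact call it uses.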

Important: In order to run this application with CUDA acceleration, one must compile OpenCV with CUDA support. We provide a sample Dockerfile to build a container based on Holoscan v2.1.0 with the latest version of OpenCV and CUDA support. In case you use it, note that the variable CUDA_ARCH_BIN must be modified according to your specific GPU configuration. Refer to this site to find out your NVIDIA GPU architecture.

Workflows#

This application can be run with or without Histogram Equalization (CLAHE) by toggling the --clahe flag.

With CLAHE#

Fig. 1 Depth Estimation Application with CLAHE enabled

The pipeline uses a recorded endoscopy video file (generated by convert_video_to_gxf_entities script) for input frames. Each input frame in the file is loaded by Video Stream Replayer and passed to the following two branches:

- In the first branch (top), the input frames are passed to the CUDACLAHEOp, then fed to the Format Converter to convert their data type from uint8 to float32, and finally fed to the InferenceOp. The result is then ingested by the DepthPostProcessingOp, which converts the depth map to uint8 and reorders its dimensions for rendering with Holoviz.
- In the second branch (bottom), the input frames are passed to a Format Converter that resizes them. Its output is finally fed to the DepthPostProcessingOp for rendering with Holoviz.

Without CLAHE#

Fig. 2 Depth Estimation Application with CLAHE disabled

The pipeline uses a recorded endoscopy video file (generated by convert_video_to_gxf_entities script) for input frames. Each input frame in the file is loaded by Video Stream Replayer and passed to a branch that first converts its data type to float32 and resizes it with a Format Converter. The preprocessed frames are then fed to the InferenceOp, and the result is mixed with the original video in the custom DepthPostProcessingOp for rendering with Holoviz.
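The two workflows described above can be summarized as operator graphs. The sketch below lists them as plain (source, destination) pairs and is only a reading aid: the node names and the `pipeline_edges` function are hypothetical, not the identifiers used in the application, and the direct edge from the replayer to the post-processor in the non-CLAHE case is an assumption, since the text does not say how the original video reaches that operator.

```python
def pipeline_edges(clahe):
    """Return (source, destination) pairs describing the operator graph."""
    if clahe:
        return [
            # Top branch: equalize, convert uint8 to float32, infer depth.
            ("replayer", "clahe"),
            ("clahe", "format_converter"),
            ("format_converter", "inference"),
            ("inference", "depth_postprocessor"),
            # Bottom branch: resize the original frames.
            ("replayer", "resizer"),
            ("resizer", "depth_postprocessor"),
            # Post-processed output is rendered by Holoviz.
            ("depth_postprocessor", "holoviz"),
        ]
    return [
        # Single branch: convert to float32 and resize, then infer depth.
        ("replayer", "format_converter"),
        ("format_converter", "inference"),
        ("inference", "depth_postprocessor"),
        # Assumption: the original video is passed to the post-processor directly.
        ("replayer", "depth_postprocessor"),
        ("depth_postprocessor", "holoviz"),
    ]


for enabled in (True, False):
    print("CLAHE enabled:" if enabled else "CLAHE disabled:")
    for source, destination in pipeline_edges(enabled):
        print("  " + source + " -> " + destination)
```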

Run Instructions#

To run this application, you'll need to configure your PYTHONPATH environment variable so that Python can locate the necessary libraries, based on your Holoscan SDK installation type.

You should refer to the glossary for the terms defining specific locations within HoloHub.

If your Holoscan SDK installation type is:

python wheels:

export PYTHONPATH=$PYTHONPATH:<HOLOHUB_BUILD_DIR>/python/lib

otherwise:

export PYTHONPATH=$PYTHONPATH:<HOLOSCAN_INSTALL_DIR>/python/lib:<HOLOHUB_BUILD_DIR>/python/lib
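One quick way to confirm that the path is set as intended is to check whether the `holoscan` package can be located before launching the application. The sketch below uses only the standard library; the helper name `is_importable` is hypothetical and not part of HoloHub.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


if is_importable("holoscan"):
    print("holoscan found on the Python path")
else:
    print("holoscan not found; check PYTHONPATH")
```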

This application should be run in the build directory of Holohub in order to load the GXF extensions. Alternatively, the relative path of the extensions in the corresponding yaml file can be modified to match the path of the working directory.

Next, run the application:

cd <HOLOHUB_BUILD_DIR>
python3 <HOLOHUB_SOURCE_DIR>/applications/endoscopy_depth_estimation/endoscopy_depth_estimation.py --data=<DATA_DIR> --model=<MODEL_DIR> --clahe
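For reference, the flags in the command above could be parsed along these lines. This is a minimal sketch based only on the options mentioned in this document (--data, --model, and --clahe or -c); the defaults and help strings are assumptions, and the parser in the actual script may differ.

```python
import argparse


def build_parser():
    # Flag names come from the README; defaults and help text are assumptions.
    parser = argparse.ArgumentParser(description="Endoscopy depth estimation (sketch)")
    parser.add_argument("--data", default=None, help="directory containing the converted video data")
    parser.add_argument("--model", default=None, help="directory containing the ONNX model")
    parser.add_argument("-c", "--clahe", action="store_true", help="enable histogram equalization (CLAHE)")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--data=/path/to/data", "--model=/path/to/model", "--clahe"])
print(args.data, args.model, args.clahe)
```

Omitting --clahe (or -c) leaves `args.clahe` as False, which corresponds to the workflow without histogram equalization.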

Container Build & Run Instructions#

The container is built using the Holoscan 2.0.0 NGC container as the base image, with OpenCV compiled for CUDA architectures 8.6, 8.7, and 8.9 to support IGX Orin and dGPUs of the Ampere and Ada Lovelace architectures. This application is currently not supported on iGPU.

Change directory to Holohub source directory#

cd <HOLOHUB_SOURCE_DIR>

Build and run the application using the development container#

./holohub run endoscopy_depth_estimation

Dev Container#

To start the Dev Container, run the following command from the root directory of Holohub:

./holohub vscode endoscopy_depth_estimation

This command will build and configure a Dev Container using a Dockerfile that is ready to run the application.

VS Code Launch Profiles#

There are two launch profiles configured for this application:

(debugpy) endoscopy_depth_estimation/python: Launch endoscopy_depth_estimation using a launch profile that enables debugging of Python code.
(pythoncpp) endoscopy_depth_estimation/python: Launch endoscopy_depth_estimation using a launch profile that enables debugging of Python and C++ code.
