Endoscopy Depth Estimation¶
Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Language: Python
Latest version: 1.0
Minimum Holoscan SDK version: 0.6.0
Tested Holoscan SDK versions: 0.6.0
Contribution metric: Level 2 - Trusted
This application demonstrates the use of custom components for depth estimation and its rendering using Holoviz with triangle interpolation.
Requirements¶
- Python 3.8+
- OpenCV 4.8+
Data¶
📦️ (NGC) Sample App Data for Endoscopy
The data is automatically downloaded and converted to the correct format when building the application. If you want to manually convert the video data, please refer to the instructions for using the convert_video_to_gxf_entities script.
Model¶
📦️ (NGC) App Model for AI-based Endoscopy Depth Estimation
The model is automatically downloaded to the same folder as the data in ONNX format.
OpenCV-GPU¶
This application uses OpenCV with GPU acceleration during the preprocessing stage when it runs with Histogram Equalization (flag --clahe or -c).
Histogram equalization reduces the effect of specular reflections and improves the overall visual quality of the depth estimation. However, using regular OpenCV datatypes leads to unnecessary transfers of data from Holoscan Tensors to the CPU and back.
This application shows how to combine Holoscan Tensors with OpenCV's GPUMat datatype in the CUDACLAHEOp operator to avoid those transfers.
Compare it to CPUCLAHEOp for reference.
To achieve an end-to-end GPU-accelerated pipeline, the preprocessing operators must access GPU memory (Holoscan Tensor) directly, without copying or moving memory. In the Holoscan SDK, this means that only libraries that implement the __cuda_array_interface__ and DLPack standards, such as cuCIM, can convert from and to Holoscan Tensors.
OpenCV, however, implements neither __cuda_array_interface__ nor standard DLPack, so some extra work is needed to use it.
First, we convert CuPy arrays to GPUMat using a fix in OpenCV that is only available from version 4.8.0 onward. More information here.
This is done in the gpumat_from_cp_array function. With a GPUMat, we can use any OpenCV-CUDA operation.
Once the GPUMat processing has finished, we convert the result back to a CuPy tensor with gpumat_to_cupy.
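The conversion described above can be sketched as follows. This is a minimal illustration and not the application's own gpumat_from_cp_array implementation: the helper names opencv_mat_type and wrap_cupy_as_gpumat are hypothetical, and the sketch assumes a C-contiguous CuPy array of shape (H, W) or (H, W, C). OpenCV's Python bindings expose the GPU matrix type as cv2.cuda.GpuMat, and cv2.cuda.createGpuMatFromCudaMemory is the call that OpenCV 4.8 added for wrapping existing device memory.

```python
# OpenCV depth codes (the values of cv2.CV_8U, cv2.CV_8S, ...), keyed by dtype name.
CV_DEPTH_BY_DTYPE = {
    "uint8": 0,    # CV_8U
    "int8": 1,     # CV_8S
    "uint16": 2,   # CV_16U
    "int16": 3,    # CV_16S
    "int32": 4,    # CV_32S
    "float32": 5,  # CV_32F
    "float64": 6,  # CV_64F
}


def opencv_mat_type(dtype_name: str, channels: int) -> int:
    """Python equivalent of the C++ macro CV_MAKETYPE(depth, channels)."""
    depth = CV_DEPTH_BY_DTYPE.get(dtype_name)
    if depth is None:
        raise ValueError(f"unsupported dtype: {dtype_name}")
    if not 1 <= channels <= 512:  # 512 is OpenCV's CV_CN_MAX
        raise ValueError(f"unsupported channel count: {channels}")
    return (depth & 7) + ((channels - 1) << 3)


def wrap_cupy_as_gpumat(arr):
    """Hypothetical helper: view a CuPy array as a GpuMat without copying.

    Assumes `arr` is C-contiguous with shape (H, W) or (H, W, C).
    """
    import cv2  # requires OpenCV >= 4.8 built with CUDA support

    if arr.ndim not in (2, 3):
        raise ValueError("expected a 2D or 3D array")
    channels = 1 if arr.ndim == 2 else arr.shape[2]
    height, width = arr.shape[:2]
    mat_type = opencv_mat_type(arr.dtype.name, channels)
    # arr.data.ptr is the raw device pointer of the CuPy allocation.
    return cv2.cuda.createGpuMatFromCudaMemory(height, width, mat_type, arr.data.ptr)
```

The returned GpuMat shares memory with the CuPy array, so the array must stay alive for as long as the GpuMat is in use.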
Important: To run this application with CUDA acceleration, OpenCV must be compiled with CUDA support.
We provide a sample Dockerfile to build a container based on Holoscan v2.1.0 with the latest version of OpenCV and CUDA support.
If you use it, note that the variable CUDA_ARCH_BIN must be modified to match your specific GPU configuration. Refer to this site to find out your NVIDIA GPU architecture.
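If CuPy is already installed on the target machine, the compute capability can also be queried programmatically. The sketch below is an optional aid and not part of the application: the helper names cuda_arch_bin and detect_cuda_arch_bin are hypothetical, and it relies on CuPy reporting the capability as a compact string such as "86".

```python
def cuda_arch_bin(compute_capability: str) -> str:
    """Turn a compact compute capability such as '86' into '8.6'."""
    digits = compute_capability.strip()
    if len(digits) < 2 or not digits.isdigit():
        raise ValueError(f"unexpected compute capability: {compute_capability!r}")
    # The last digit is the minor version, everything before it is the major version.
    return f"{digits[:-1]}.{digits[-1]}"


def detect_cuda_arch_bin(device_id: int = 0) -> str:
    """Query the given GPU through CuPy (needs CuPy and a visible NVIDIA GPU)."""
    import cupy as cp

    return cuda_arch_bin(cp.cuda.Device(device_id).compute_capability)
```

On a machine with a GPU, the value returned by detect_cuda_arch_bin() is the one to compare against the CUDA_ARCH_BIN setting in the Dockerfile.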
Workflows¶
This application can be run with or without Histogram Equalization (CLAHE) by toggling the --clahe flag.
With CLAHE¶
Fig. 1 Depth Estimation Application with CLAHE enabled
The pipeline uses a recorded endoscopy video file (generated by the convert_video_to_gxf_entities script) for input frames. Each input frame in the file is loaded by Video Stream Replayer and passed to the following two branches:
- In the first branch (top), the input frames are passed to the CUDACLAHEOp, then fed to the Format Converter to convert their data type from uint8 to float32, and finally fed to the InferenceOp. The result is then ingested by the DepthPostProcessingOp, which converts the depth map to uint8 and reorders its dimensions for rendering with Holoviz.
- In the second branch (bottom), the input frames are passed to a Format Converter that resizes them. Its output is then fed to the DepthPostProcessingOp for rendering with Holoviz.
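The post-processing step in the first branch (conversion to uint8 and dimension reordering) can be illustrated with a short sketch. It is not the DepthPostProcessingOp implementation. The function name depth_to_uint8, the min-max scaling, and the (1, H, W) input layout are all assumptions made for illustration, and NumPy stands in for the GPU arrays that the application handles.

```python
import numpy as np


def depth_to_uint8(depth: np.ndarray) -> np.ndarray:
    """Scale a float depth map of shape (1, H, W) to uint8 with shape (H, W, 1).

    Min-max scaling is an assumption; the real operator may normalize differently.
    """
    depth = depth.astype(np.float32)
    lo, hi = float(depth.min()), float(depth.max())
    # Guard against a constant image, which would otherwise divide by zero.
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    scaled = np.clip(np.rint((depth - lo) * scale), 0, 255).astype(np.uint8)
    # Reorder from channel-first (C, H, W) to channel-last (H, W, C) for display.
    return np.transpose(scaled, (1, 2, 0))


demo = np.array([[[0.5, 1.0], [1.5, 2.5]]], dtype=np.float32)
image = depth_to_uint8(demo)
print(image.shape, image.dtype)
```

The nearest and farthest values in the input map to 0 and 255, so the full display range is used regardless of the model's output scale.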
Without CLAHE¶
Fig. 2 Depth Estimation Application with CLAHE disabled
The pipeline uses a recorded endoscopy video file (generated by the convert_video_to_gxf_entities script) for input frames. Each input frame in the file is loaded by Video Stream Replayer and passed to a branch that first converts its data type to float32 and resizes it with a Format Converter.
Then, the preprocessed frames are fed to the InferenceOp and mixed with the original video in the custom DepthPostProcessingOp for rendering with Holoviz.
Run Instructions¶
To run this application, you'll need to configure your PYTHONPATH environment variable to locate the necessary Python libraries based on your Holoscan SDK installation type.
You should refer to the glossary for the terms defining specific locations within HoloHub.
If your Holoscan SDK installation type is:
- python wheels:
export PYTHONPATH=$PYTHONPATH:<HOLOHUB_BUILD_DIR>/python/lib
- otherwise:
export PYTHONPATH=$PYTHONPATH:<HOLOSCAN_INSTALL_DIR>/python/lib:<HOLOHUB_BUILD_DIR>/python/lib
This application should be run in the build directory of Holohub in order to load the GXF extensions. Alternatively, the relative path of the extensions in the corresponding yaml file can be modified to match the path of the working directory.
Next, run the application:
cd <HOLOHUB_BUILD_DIR>
python3 <HOLOHUB_SOURCE_DIR>/applications/endoscopy_depth_estimation/endoscopy_depth_estimation.py --data=<DATA_DIR> --model=<MODEL_DIR> --clahe
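The options used in the command above could be parsed along these lines. This is a sketch of one way to handle --data, --model, and --clahe / -c with argparse. The function name build_parser, the defaults, and the help texts are assumptions and may differ from the actual script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the run command."""
    parser = argparse.ArgumentParser(description="Endoscopy depth estimation (sketch)")
    parser.add_argument("--data", default=None, help="directory containing the sample video data")
    parser.add_argument("--model", default=None, help="directory containing the ONNX model")
    parser.add_argument(
        "-c",
        "--clahe",
        action="store_true",
        help="enable histogram equalization (CLAHE) during preprocessing",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--data=/path/to/data", "--model=/path/to/model", "--clahe"])
print(args.data, args.model, args.clahe)
```

Because the flag uses action="store_true", omitting --clahe leaves the option set to False, which corresponds to the workflow without CLAHE.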
Container Build & Run Instructions¶
Build the container using the Holoscan 2.0.0 NGC container as the base image. OpenCV is built with support for CUDA architectures 8.6, 8.7 and 8.9, covering IGX Orin and Ampere and Ada Lovelace architecture dGPUs. This application is currently not supported on iGPU.
Change directory to Holohub source directory¶
cd <HOLOHUB_SOURCE_DIR>
Build and run the application using the development container¶
./dev_container build_and_run endoscopy_depth_estimation
Dev Container¶
To start the Dev Container, run the following command from the root directory of Holohub:
./dev_container vscode endoscopy_depth_estimation
This command will build and configure a Dev Container using a Dockerfile that is ready to run the application.
VS Code Launch Profiles¶
There are two launch profiles configured for this application:
- (debugpy) endoscopy_depth_estimation/python: Launch endoscopy_depth_estimation using a launch profile that enables debugging of Python code.
- (pythoncpp) endoscopy_depth_estimation/python: Launch endoscopy_depth_estimation using a launch profile that enables debugging of Python and C++ code.