Depth Anything V2

Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Language: Python
Latest version: 1.0
Minimum Holoscan SDK version: 2.5.0
Tested Holoscan SDK versions: 2.8.0
Contribution metric: Level 2 - Trusted

This application uses the Depth Anything V2 model for monocular depth estimation: the task of predicting the distance of objects in a scene from a single 2D image captured by a standard camera.

Model

The Depth Anything V2 model comes from DepthAnythingV2. It is downloaded when the Docker image is built.

NOTE: The user is responsible for checking if the model license is suitable for the intended purpose.

Data

This application downloads a pre-recorded video from Pexels when the application is built. Please review the license terms from Pexels.

NOTE: The user is responsible for ensuring the dataset license is suitable for the intended purpose.

Input

This app currently supports two input options:

  1. v4l2 compatible input device (default; see V4L2 Support below)
  2. Pre-recorded video (see Video Replayer Support below)

Run Instructions

V4L2 Support

This application supports v4l2 compatible devices as input. To run this application with your v4l2 compatible device, please plug in your input device and run:

./dev_container build_and_run depth_anything_v2

By default, this application expects the input device to be mounted at /dev/video0. If this is not the case, update the applications/depth_anything_v2/depth_anything_v2.yaml file to set the corresponding input device before running the application. You can also override the default input device on the command line by running:

./dev_container build_and_run depth_anything_v2 --run_args "--video_device /dev/video0"

Video Replayer Support

If you don't have a v4l2 compatible device plugged in, you can also run this application on a pre-recorded video. To launch the application using the Video Stream Replayer as the input source, run:

./dev_container build_and_run depth_anything_v2 --run_args "--source replayer"
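
As an illustration of how the two flags passed through --run_args above might be handled, here is a minimal sketch using Python's standard argparse module. The flag names --source and --video_device and the value replayer come from the commands above. The value v4l2 for the default source, the use of None as the "keep the YAML setting" default, and the function name parse_args are assumptions made for this sketch, not the application's actual code.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags shown in the run instructions (illustrative only)."""
    parser = argparse.ArgumentParser(description="Depth Anything V2 demo (sketch)")
    parser.add_argument(
        "--source",
        choices=["v4l2", "replayer"],  # "v4l2" as a literal value is an assumption
        default="v4l2",                # V4L2 is the documented default input
        help="Input source: a V4L2 device or the pre-recorded video.",
    )
    parser.add_argument(
        "--video_device",
        default=None,  # assumption: None means "use the device set in the YAML file"
        help="V4L2 device node, for example /dev/video0.",
    )
    return parser.parse_args(argv)


# Example: the arguments from the Video Replayer command above.
args = parse_args(["--source", "replayer"])
print(f"source={args.source} video_device={args.video_device}")
```

Keeping the device default unset lets the YAML configuration remain the single place where /dev/video0 is defined, with the command line acting only as an override.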

Display Modes

This application has multiple display modes which you can toggle through using the left mouse button.

  • original: output the original image from the input source
  • depth: output the color depthmap derived from the depthmap returned by the Depth Anything V2 model
  • side-by-side: output the original image next to the color depthmap
  • interactive: allow the user to adjust how much of the original image versus the color depthmap is shown
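
To make the depth mode concrete, the following is a rough sketch of how a raw depthmap could be turned into a color depthmap: the values are min-max normalized and then mapped onto a color ramp. The function name colorize_depth, the list-of-rows frame representation, and the blue-to-red ramp are all assumptions chosen to keep the example dependency-free. The application's actual colormap and data types may differ.

```python
def colorize_depth(depth):
    """Map a 2D depthmap (list of rows of floats) to RGB tuples.

    Values are min-max normalized to [0, 1], then mapped onto a simple
    blue-to-red ramp. The ramp is a placeholder, not the app's colormap.
    """
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant map
    colored = []
    for row in depth:
        out_row = []
        for v in row:
            t = (v - lo) / span   # normalized value in [0, 1]
            r = round(255 * t)    # larger values shift toward red
            out_row.append((r, 0, 255 - r))
        colored.append(out_row)
    return colored
```

Normalizing per frame keeps the full color range in use for every image, at the cost of colors not being comparable between frames.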

In interactive mode, the middle or right mouse button can be used to change the ratio of the original image to the color depthmap that is shown.
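
The mode switching described above can be sketched as a small state machine plus a compositing function. This is a simplified illustration, not the application's implementation: the names DisplayMode, next_mode, and compose are hypothetical, frames are plain lists of rows rather than GPU tensors, the cycling order is assumed to follow the list above, and interactive mode is assumed to split the frame vertically at the chosen ratio.

```python
from enum import Enum


class DisplayMode(Enum):
    ORIGINAL = "original"
    DEPTH = "depth"
    SIDE_BY_SIDE = "side-by-side"
    INTERACTIVE = "interactive"


def next_mode(mode):
    """Advance to the next display mode, wrapping around (one left click)."""
    modes = list(DisplayMode)
    return modes[(modes.index(mode) + 1) % len(modes)]


def compose(mode, original, depth, ratio=0.5):
    """Build the output frame for a display mode.

    Both frames are lists of rows with equal dimensions. `ratio` is the
    fraction of the width taken from the original image in interactive
    mode (assumed semantics).
    """
    if mode is DisplayMode.ORIGINAL:
        return original
    if mode is DisplayMode.DEPTH:
        return depth
    if mode is DisplayMode.SIDE_BY_SIDE:
        # Concatenate each row: original on the left, depthmap on the right.
        return [o + d for o, d in zip(original, depth)]
    ratio = min(max(ratio, 0.0), 1.0)  # clamp to a valid fraction
    split = round(len(original[0]) * ratio)
    # Left part from the original image, remainder from the depthmap.
    return [o[:split] + d[split:] for o, d in zip(original, depth)]
```

In this sketch a left click would call next_mode, while a middle or right click would update ratio before the next call to compose.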

Acknowledgement

This project is based on the following projects:

  • Depth-Anything-V2 - Depth Anything V2
  • depth-anything-tensorrt - Depth Anything TensorRT CLI

Known Issues

There is a known issue when running this application on IGX with iGPU and on Jetson AGX (see #500). The workaround is to update the device so that it does not pick up the libnvv4l2.so library:

cd /usr/lib/aarch64-linux-gnu/
ls -l libv4l2.so.0.0.999999
sudo rm libv4l2.so.0.0.999999
sudo ln -s libv4l2.so.0.0.0.0 libv4l2.so.0.0.999999
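
Before and after applying the workaround, it can help to check what the link actually resolves to. The sketch below does this with the Python standard library. The path is the one used in the commands above. The helper names and the assumption that the problematic target has "nvv4l2" in its file name are illustrative guesses based on the library name mentioned in this section.

```python
import os

# Path taken from the workaround commands above.
V4L2_LINK = "/usr/lib/aarch64-linux-gnu/libv4l2.so.0.0.999999"


def resolve_v4l2_link(path=V4L2_LINK):
    """Return the file the libv4l2 link resolves to, or None if it is missing."""
    if not os.path.lexists(path):
        return None
    return os.path.realpath(path)


def points_to_nvidia_v4l2(path=V4L2_LINK):
    """Heuristic: True if the link resolves to a file named like libnvv4l2.so."""
    target = resolve_v4l2_link(path)
    return target is not None and "nvv4l2" in os.path.basename(target)


print("resolved target:", resolve_v4l2_link())
print("points to libnvv4l2:", points_to_nvidia_v4l2())
```

On a machine where the path does not exist (for example, an x86_64 host), the first function returns None and the second returns False.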