📷🤖 Florence-2¶
Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Language: Python
Latest version: 1.0.0
Minimum Holoscan SDK version: 2.1.0
Tested Holoscan SDK versions: 2.1.0
Contribution metric: Level 1 - Highly Reliable
This application demonstrates how to run the Florence-2 models on a live video feed, with a Qt UI for changing the task and the optional prompt.
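To illustrate what "task and optional prompt" means: the Florence-2 model cards describe selecting a task with a special token such as `<CAPTION>` or `<OD>`, and, for tasks that take text input, appending that text directly after the token. The sketch below builds such a prompt string. The helper name `build_prompt` and the small task table are illustrative assumptions and are not taken from this application's code.

```python
from typing import Optional

# A few task tokens listed on the Florence-2 model cards (not exhaustive).
# The value is True when the task expects extra text after the token.
TASKS = {
    "<CAPTION>": False,
    "<DETAILED_CAPTION>": False,
    "<OD>": False,
    "<OCR>": False,
    "<CAPTION_TO_PHRASE_GROUNDING>": True,
}


def build_prompt(task: str, text: Optional[str] = None) -> str:
    """Combine a task token and optional text into one prompt string.

    Hypothetical helper for illustration only.
    """
    if task not in TASKS:
        raise ValueError(f"Unknown task token: {task!r}")
    if TASKS[task]:
        if not text or not text.strip():
            raise ValueError(f"Task {task!r} needs a text prompt")
        # The model card concatenates the token and the text directly.
        return task + text.strip()
    # Tasks without text input ignore any prompt that was supplied.
    return task
```

For example, `build_prompt("<CAPTION_TO_PHRASE_GROUNDING>", "a green car")` yields the token followed immediately by the phrase, while `build_prompt("<OD>")` returns the token alone.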
Note: This demo currently uses Florence-2-large-ft, but any of the Florence-2 models should work as long as the correct URLs and names are used in the Dockerfile and config.yaml:
- Florence-2-large-ft
- Florence-2-large
- Florence-2-base-ft
- Florence-2-base
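When switching models, a name mismatch between the Dockerfile and config.yaml is easy to introduce. A minimal sketch of a name check follows. It assumes the models are published on Hugging Face under the `microsoft` organization; the helper `model_repo_id` is hypothetical, and the actual keys and URLs used by this application's Dockerfile and config.yaml are not shown here.

```python
# The four variants named in the note above.
SUPPORTED_MODELS = (
    "Florence-2-large-ft",
    "Florence-2-large",
    "Florence-2-base-ft",
    "Florence-2-base",
)


def model_repo_id(name: str, org: str = "microsoft") -> str:
    """Return an "<org>/<name>" repository id for a supported variant.

    Hypothetical helper; the "microsoft" default is an assumption.
    """
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"{name!r} is not one of: {', '.join(SUPPORTED_MODELS)}"
        )
    return f"{org}/{name}"
```

Using the same validated name in both files keeps the downloaded weights and the configured model in agreement.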
⚙️ Setup Instructions¶
The app defaults to using the video device at /dev/video0.
Note: You can use a USB webcam as the video source, or an MP4 video by following the instructions for the V4L2_Camera example app.
To check whether this is the correct device, install v4l-utils and list the available devices with v4l2-ctl:
sudo apt-get install v4l-utils
v4l2-ctl --list-devices
NVIDIA Tegra Video Input Device (platform:tegra-camrtc-ca):
/dev/media0
vi-output, lt6911uxc 2-0056 (platform:tegra-capture-vi:0):
/dev/video0
Dummy video device (0x0000) (platform:v4l2loopback-000):
/dev/video3
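The listing above can also be checked from Python. The sketch below parses `v4l2-ctl --list-devices` output into a mapping from device name to device nodes and looks up which device owns `/dev/video0`. It assumes the output layout shown above (a name line ending in a colon, followed by `/dev/...` lines); the function names are illustrative and not part of this application.

```python
import subprocess
from typing import Dict, List, Optional


def parse_device_list(output: str) -> Dict[str, List[str]]:
    """Map each device name to the /dev nodes listed beneath it."""
    devices: Dict[str, List[str]] = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/dev/"):
            # A node line belongs to the most recent name line.
            if current is not None:
                devices[current].append(line)
        elif line.endswith(":"):
            current = line[:-1]
            devices[current] = []
    return devices


def find_device_name(output: str, node: str = "/dev/video0") -> Optional[str]:
    """Return the name of the device that exposes `node`, or None."""
    for name, nodes in parse_device_list(output).items():
        if node in nodes:
            return name
    return None


def list_devices() -> Dict[str, List[str]]:
    """Run v4l2-ctl (requires v4l-utils) and parse its output."""
    result = subprocess.run(
        ["v4l2-ctl", "--list-devices"],
        capture_output=True, text=True, check=True,
    )
    return parse_device_list(result.stdout)
```

If `find_device_name` returns `None` for `/dev/video0`, the default device is not present and a different node needs to be configured.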
🚀 Build and Run Instructions¶
From the Holohub main directory, run the following command:
./dev_container build_and_run florence-2-vision
💻 Supported Hardware¶
- IGX w/ dGPU
- x86 w/ dGPU
- IGX w/ iGPU and Jetson AGX supported with workaround
There is a known issue running this application on IGX w/ iGPU and on Jetson AGX (see #500). The workaround is to repoint the libv4l2 symlink on the device so that the libnvv4l2.so library is not picked up:
cd /usr/lib/aarch64-linux-gnu/
ls -l libv4l2.so.0.0.999999
sudo rm libv4l2.so.0.0.999999
sudo ln -s libv4l2.so.0.0.0.0 libv4l2.so.0.0.999999
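Before or after running these commands, it can help to confirm where the library name actually resolves. The read-only sketch below reports whether a path is a symlink and which file it resolves to. The helper names are hypothetical; the path and target in the usage note are the ones from the commands above.

```python
import os
from typing import Tuple


def describe_link(path: str) -> Tuple[bool, str]:
    """Return (is_symlink, fully resolved path) without changing anything."""
    return os.path.islink(path), os.path.realpath(path)


def points_to(path: str, expected_name: str) -> bool:
    """True if `path` resolves to a file whose basename is `expected_name`."""
    return os.path.basename(os.path.realpath(path)) == expected_name
```

On the device, `points_to("/usr/lib/aarch64-linux-gnu/libv4l2.so.0.0.999999", "libv4l2.so.0.0.0.0")` would be expected to return `True` once the workaround has been applied.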
Dev Container¶
To start the Dev Container, run the following command from the root directory of Holohub:
./dev_container vscode florence-2-vision
This command will build and configure a Dev Container using a Dockerfile that is ready to run the application.
VS Code Launch Profiles¶
There are two launch profiles configured for this application:
- (debugpy) florence-2-vision/python: Launch florence-2-vision using a launch profile that enables debugging of Python code.
- (pythoncpp) florence-2-vision/python: Launch florence-2-vision using a launch profile that enables debugging of Python and C++ code.