# Real-Time End-to-End AI Surgical Video Workflow
- Authors: Holoscan Team (NVIDIA)
- Supported platforms: x86_64, aarch64
- Language: Python
- Last modified: August 5, 2025
- Latest version: 1.0
- Minimum Holoscan SDK version: 3.0.0
- Tested Holoscan SDK versions: 3.1.0
- Contribution metric: Level 1 - Highly Reliable
Fig.1: The overall diagram illustrating the end-to-end pipeline for real-time AI surgical video processing. The pipeline achieves an average end-to-end latency of 37ms (maximum 54ms). Key latency components are shown: Holoscan Sensor Bridge (HSB) latency averages 21ms (max 28ms), and the AI application averages 16ms (median 17ms, 95th percentile 18ms, 99th percentile 22ms, max 26ms). These results demonstrate the solution's high-performance, low-latency capabilities for demanding surgical video applications.
## Overview
This reference application offers developers a modular, end-to-end pipeline that spans the entire sensor processing workflow—from sensor data ingestion and accelerated computing to AI inference, real-time visualization, and data stream output.
Specifically, we demonstrate a comprehensive real-time end-to-end AI surgical video pipeline that includes the following stages (a minimal composition sketch follows the list):

- Sensor I/O: integration with the Holoscan Sensor Bridge, enabling GPUDirect data ingestion for ultra-low-latency input of surgical video feeds.
- Out-of-body detection to determine whether the endoscope is inside or outside the patient's body, ensuring patient privacy by removing identifiable information.
- Dynamic flow control based on the out-of-body detection results.
- De-identification: pixelates the image to anonymize out-of-body elements such as people's faces.
- Multi-AI: simultaneous execution of multiple models at inference time, processing surgical tools with:
  - an SSD model for surgical tool detection
  - a MONAI model for endoscopic tool segmentation
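The sketch below illustrates how these stages can be composed with the Holoscan SDK's Python API. It is a minimal, illustrative skeleton, not the application's actual code: the operator names, model keys, tensor names, replayer `basename`, and `<DATA_DIR>` paths are assumptions, and only the source, preprocessing, out-of-body inference, and visualization stages are wired.

```python
from holoscan.core import Application
from holoscan.operators import (
    FormatConverterOp,
    HolovizOp,
    InferenceOp,
    VideoStreamReplayerOp,
)
from holoscan.resources import UnboundedAllocator


class AISurgicalVideoSketch(Application):
    """Illustrative composition of the pipeline stages listed above."""

    def compose(self):
        pool = UnboundedAllocator(self, name="pool")

        # Sensor I/O: a video replayer stands in for the HSB/AJA sources here.
        source = VideoStreamReplayerOp(
            self, name="replayer", directory="<DATA_DIR>/orsi", basename="video"
        )
        # Convert frames into the float tensor layout the models expect.
        preprocessor = FormatConverterOp(
            self,
            name="preprocessor",
            pool=pool,
            out_dtype="float32",
            out_tensor_name="source_video",
        )
        # Out-of-body classifier (model file from the Models table below;
        # the path and the "oob" key are assumptions).
        out_of_body = InferenceOp(
            self,
            name="out_of_body_inference",
            backend="trt",
            allocator=pool,
            model_path_map={"oob": "<DATA_DIR>/anonymization_model.onnx"},
            pre_processor_map={"oob": ["source_video"]},
            inference_map={"oob": ["oob_scores"]},
        )
        # Real-time visualization of the (possibly annotated) video.
        visualizer = HolovizOp(self, name="holoviz")

        self.add_flow(source, preprocessor)
        self.add_flow(preprocessor, out_of_body, {("tensor", "receivers")})
        self.add_flow(source, visualizer, {("output", "receivers")})
```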
## Architecture
Fig.2: The workflow diagram representing all the Holoscan operators (in green) and Holoscan Sensor Bridge operators (in yellow). The source can be a Holoscan Sensor Bridge, an AJA card, or a video replayer.
Fig.3: Endoscopy image from a partial nephrectomy procedure (surgical removal of the diseased portion of the kidney) showing AI tool segmentation results when the camera is inside the body, and a de-identified (pixelated) output image when the camera is outside the body.
### 1. Out-of-Body Detection
Using an AI model, the workflow first determines whether the endoscope is inside or outside the patient's body.
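As an illustration, a postprocessing operator for such a classifier might reduce the model's scores to a single boolean, roughly as follows. This is a sketch only: the `oob_scores` tensor name and the two-class `[inside, outside]` score layout are assumptions, not the actual model interface.

```python
import cupy as cp

from holoscan.core import Operator, OperatorSpec


class OutOfBodyPostprocessorOp(Operator):
    """Reduce out-of-body classifier scores to a boolean decision (sketch)."""

    def setup(self, spec: OperatorSpec):
        spec.input("in")
        spec.output("is_out_of_body")

    def compute(self, op_input, op_output, context):
        message = op_input.receive("in")
        # Assumed tensor name and [inside, outside] score layout.
        scores = cp.asarray(message.get("oob_scores")).ravel()
        op_output.emit(bool(scores[1] > scores[0]), "is_out_of_body")
```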
### 2. Dynamic Flow Control

- If outside the body: the video is de-identified through pixelation to protect privacy
- If inside the body: the video is processed by the multi-AI pipeline (see the routing sketch below)
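One way to express this routing is with the SDK's dynamic flow control API (`set_dynamic_flows` / `add_dynamic_flow`). The sketch below is an assumption about how the routing could be wired, not the application's actual implementation; `pixelator` and `multi_ai` are hypothetical downstream operators.

```python
from holoscan.core import Operator, OperatorSpec


class RouterOp(Operator):
    """Pass frames through, remembering the latest out-of-body decision."""

    def __init__(self, fragment, *args, **kwargs):
        self.is_out_of_body = False
        super().__init__(fragment, *args, **kwargs)

    def setup(self, spec: OperatorSpec):
        spec.input("frame")
        spec.input("is_out_of_body")
        spec.output("out")

    def compute(self, op_input, op_output, context):
        self.is_out_of_body = bool(op_input.receive("is_out_of_body"))
        op_output.emit(op_input.receive("frame"), "out")


# Inside Application.compose(), after creating `router`, `pixelator`
# (de-identification), and `multi_ai` (tool detection + segmentation),
# both potential flows are declared and one is selected per frame:
#
#     self.add_flow(router, pixelator)
#     self.add_flow(router, multi_ai)
#     self.set_dynamic_flows(
#         router,
#         lambda op: op.add_dynamic_flow(pixelator)
#         if op.is_out_of_body
#         else op.add_dynamic_flow(multi_ai),
#     )
```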
### 3. Multi-AI Processing

When inside the body, two AI models run concurrently (a configuration sketch follows the list):
- SSD detection model identifies surgical tools with bounding boxes
- MONAI segmentation model provides pixel-level segmentation of tools
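Concurrent execution of the two models can be expressed in a single `InferenceOp` with `parallel_inference` enabled. The helper below (to be called from `Application.compose()` with `app=self`) is a hedged sketch: the model keys, tensor names, and `<DATA_DIR>` paths are assumptions.

```python
from holoscan.operators import InferenceOp
from holoscan.resources import UnboundedAllocator


def make_multi_ai(app):
    """Build a two-model inference operator (illustrative sketch)."""
    pool = UnboundedAllocator(app, name="multi_ai_pool")
    return InferenceOp(
        app,
        name="multi_ai_inference",
        backend="trt",
        allocator=pool,
        parallel_inference=True,  # run both models concurrently
        model_path_map={
            "ssd": "<DATA_DIR>/ssd_model/epoch24_nms.onnx",
            "tool_seg": (
                "<DATA_DIR>/monai_tool_seg_model/"
                "model_endoscopic_tool_seg_sanitized_nhwc_in_nchw_out.onnx"
            ),
        },
        # Assumed names of the preprocessed input tensors per model.
        pre_processor_map={
            "ssd": ["ssd_preprocessed"],
            "tool_seg": ["seg_preprocessed"],
        },
        # Assumed names of the output tensors per model.
        inference_map={
            "ssd": ["ssd_inferred"],
            "tool_seg": ["tool_seg_inferred"],
        },
    )
```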
### 4. Visualization

The `HolovizOp` displays the processed video with overlaid AI results (a configuration sketch follows the list), including:
- Bounding boxes around detected tools
- Segmentation masks for tools
- Text labels for detected tools
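A hedged sketch of such a Holoviz configuration follows (again called from `Application.compose()` with `app=self`); the tensor names, colors, and two-class lookup table are illustrative assumptions, not the application's actual settings.

```python
from holoscan.operators import HolovizOp


def make_visualizer(app):
    """Build a HolovizOp that overlays boxes, masks, and labels (sketch)."""
    return HolovizOp(
        app,
        name="holoviz",
        # Lookup table for the segmentation mask: background transparent,
        # tool class rendered as semi-transparent purple (2 classes assumed).
        color_lut=[
            [0.0, 0.0, 0.0, 0.0],
            [0.7, 0.2, 0.8, 0.5],
        ],
        tensors=[
            # The video frame itself.
            dict(name="", type="color"),
            # Bounding boxes around detected tools.
            dict(
                name="boxes",
                type="rectangles",
                color=[1.0, 0.0, 0.0, 1.0],
                line_width=2.0,
            ),
            # Pixel-level segmentation mask, colored via the LUT above.
            dict(name="tool_seg", type="color_lut"),
            # Text labels for detected tools.
            dict(name="label", type="text", color=[1.0, 1.0, 1.0, 1.0]),
        ],
    )
```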
## Requirements
### Software
- Holoscan SDK >= v3.0: The Holohub command takes care of this dependency when using the Holohub container. Alternatively, you can install the Holoscan SDK via one of the methods specified in the SDK user guide.
- Holoscan Sensor Bridge >= v2.0: Please see the quick start guide for building the Holoscan Sensor Bridge Docker container.
### Models
This workflow utilizes the following three AI models:
| Model | Description | File |
|---|---|---|
| 📦️ Out-of-body Detection Model | Detects whether the endoscope is inside or outside the body | `anonymization_model.onnx` |
| 📦️ SSD Detection for Endoscopy Surgical Tools | Detects surgical tools with bounding boxes | `epoch24_nms.onnx` |
| 📦️ MONAI Endoscopic Tool Segmentation | Provides pixel-level segmentation of tools | `model_endoscopic_tool_seg_sanitized_nhwc_in_nchw_out.onnx` |
### Sample Data
- 📦️ Orsi partial nephrectomy procedures - Sample endoscopy video data for use with the `replayer` source

> Note: The directory specified by `--data` at runtime is assumed to contain three subdirectories, corresponding to the NGC resources specified in Models and Sample Data: `orsi`, `monai_tool_seg_model`, and `ssd_model`. These resources are automatically downloaded to the Holohub data directory when building the application.
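As a quick sanity check, a sketch like the following (using the same `<DATA_DIR>` placeholder as the run commands below) verifies that the expected layout is present:

```python
from pathlib import Path

data_dir = Path("<DATA_DIR>")  # e.g. the path passed via --data
for subdir in ("orsi", "monai_tool_seg_model", "ssd_model"):
    if not (data_dir / subdir).is_dir():
        raise FileNotFoundError(f"Expected subdirectory '{subdir}' under {data_dir}")
```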
## Quick Start Guide
```bash
./holohub run ai_surgical_video
```
This single command creates and launches the Holohub container, builds the workflow, and runs it with the default arguments set in the `config.yaml` file and the `replayer` source.
### Using an AJA Card as I/O
```bash
./holohub run ai_surgical_video --run-args "--source aja"
```
### Using the Holoscan Sensor Bridge as I/O
When running the workflow with `--source hsb`, the Holoscan Sensor Bridge software must be installed. You can build a Holoscan Sensor Bridge container using the following commands:
```bash
git clone https://github.com/nvidia-holoscan/holoscan-sensor-bridge.git
cd holoscan-sensor-bridge
git checkout hsdk-3.0
./docker/build.sh --dgpu  # for discrete GPU
./docker/build.sh --igpu  # for integrated GPU
```
This builds a Docker image tagged `hololink-demo:2.0.0`.
Once you have built the Holoscan Sensor Bridge container, you can build the Holohub container and run the workflow using the following command:
```bash
./holohub run --base-img hololink-demo:2.0.0 --img holohub:link ai_surgical_video --run-args="--source hsb"
```
## Advanced Usage
### Using the Holohub Container
First, you need to run the Holohub container:
```bash
./holohub run-container ai_surgical_video
```
> Note: If using the Holoscan Sensor Bridge, please see the Using the Holoscan Sensor Bridge as I/O section for building the Holoscan Sensor Bridge Docker container first (tagged `hololink-demo:2.0.0`), and then use the following command to run the Holohub container:
>
> ```bash
> ./holohub run-container --base-img hololink-demo:2.0.0 --img holohub:link
> ```
### Building the Application
Once your environment is set up, you can build the workflow using the following command:
```bash
./holohub build ai_surgical_video
```
### Running the Application
#### Using the Holohub Container from Outside the Container
Using the Holohub container, you can run the workflow without building it again:
```bash
./holohub run ai_surgical_video --no-build
```
However, if you want to rebuild the workflow, simply remove the `--no-build` flag:
```bash
./holohub run ai_surgical_video
```
Alternatively, you can run the application directly from the source directory:
```bash
cd <HOLOHUB_SOURCE_DIR>/workflows/ai_surgical_video/python
python3 ai_surgical_video.py --source hsb --data <DATA_DIR> --config <CONFIG_FILE>
```
> TIP: You can get the exact "Run command", along with the "Run environment" and "Run workdir", by executing:
```bash
./holohub run ai_surgical_video --dryrun --local
```
## Command Line Arguments
The application accepts the following command line arguments:
| Argument | Description | Default |
|---|---|---|
| `-s, --source` | Source of video input: `replayer`, `aja`, or `hsb` | `replayer` |
| `-c, --config` | Path to a custom configuration file | `config.yaml` in the application directory |
| `-d, --data` | Path to the data directory containing model and video files | Uses the `HOLOHUB_DATA_PATH` environment variable |
| `--headless` | Run in headless mode (no visualization) | `False` |
| `--fullscreen` | Run in fullscreen mode | `False` |
| `--camera-mode` | Camera mode to use (0, 1, 2, or 3) | `0` |
| `--frame-limit` | Exit after receiving this many frames | No limit |
| `--hololink` | IP address of the Hololink board | `192.168.0.2` |
| `--log-level` | Logging level to display | `20` |
| `--ibv-name` | IBV device to use | First available InfiniBand device |
| `--ibv-port` | Port number of the IBV device | `1` |
| `--expander-configuration` | I2C expander configuration (0 or 1) | `0` |
| `--pattern` | Display a test pattern (0-11) | None |
| `--ptp-sync` | After reset, wait for PTP time to synchronize | `False` |
| `--skip-reset` | Don't call reset on the Hololink device | `False` |
## Benchmarking
Please refer to Holoscan Benchmarking for instructions on benchmarking this workflow.