
Real-Time End-to-End AI Surgical Video Workflow#

Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Language: Python
Last modified: August 5, 2025
Latest version: 1.0
Minimum Holoscan SDK version: 3.0.0
Tested Holoscan SDK versions: 3.1.0
Contribution metric: Level 1 - Highly Reliable

Fig. 1: The overall diagram illustrating the end-to-end pipeline for real-time AI surgical video processing. The pipeline achieves an average end-to-end latency of 37 ms (maximum 54 ms). Key latency components are shown: Holoscan Sensor Bridge (HSB) latency averages 21 ms (max 28 ms), and the AI application averages 16 ms (median 17 ms, 95th percentile 18 ms, 99th percentile 22 ms, max 26 ms). These results demonstrate the solution's high-performance, low-latency capabilities for demanding surgical video applications.

Overview#

This reference application offers developers a modular, end-to-end pipeline that spans the entire sensor processing workflow—from sensor data ingestion and accelerated computing to AI inference, real-time visualization, and data stream output.

Specifically, we demonstrate a comprehensive real-time end-to-end AI surgical video pipeline that includes:

  1. Sensor I/O: Integration with the Holoscan Sensor Bridge, enabling GPUDirect data ingestion for ultra-low-latency input of surgical video feeds.
  2. Out-of-body detection: Determines whether the endoscope is inside or outside the patient's body, so that identifiable information can be removed to protect patient privacy.
  3. Dynamic flow control: Routes frames based on the out-of-body detection results.
  4. De-identification: Pixelates the image to anonymize out-of-body content such as people's faces.
  5. Multi-AI: Enables simultaneous execution of multiple models at inference time for surgical tool processing, with:
      • SSD detection for surgical tool bounding boxes
      • MONAI segmentation for pixel-level endoscopic tool segmentation

Architecture#

Fig. 2: The workflow diagram showing all the Holoscan operators (green) and Holoscan Sensor Bridge operators (yellow). The source can be a Holoscan Sensor Bridge, an AJA card, or a video replayer.

Fig. 3: Endoscopy image from a partial nephrectomy procedure (surgical removal of the diseased portion of the kidney) showing AI tool segmentation results when the camera is inside the body, and a de-identified (pixelated) output image when the camera is outside of the body.
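To make the dataflow in Fig. 2 concrete, below is a minimal sketch of how such a pipeline could be composed with the Holoscan SDK's Python API. This is not the workflow's actual source: operator wiring, paths, and tensor names (such as oob_score) are illustrative placeholders, and the dynamic routing, de-identification, and multi-AI branches are elided.

from holoscan.core import Application
from holoscan.operators import (
    FormatConverterOp,
    HolovizOp,
    InferenceOp,
    VideoStreamReplayerOp,
)
from holoscan.resources import UnboundedAllocator


class AISurgicalVideoSketch(Application):
    def compose(self):
        pool = UnboundedAllocator(self, name="pool")

        # Source: a video replayer standing in for HSB or AJA input.
        source = VideoStreamReplayerOp(
            self, name="replayer",
            directory="data/orsi", basename="video",  # placeholder paths
        )

        # Convert frames into the float tensor layout the model expects.
        preprocessor = FormatConverterOp(
            self, name="preprocessor", pool=pool,
            out_dtype="float32", out_tensor_name="source_video",
        )

        # Out-of-body classifier (single-model inference).
        out_of_body = InferenceOp(
            self, name="out_of_body", backend="trt", allocator=pool,
            model_path_map={"oob": "data/orsi/anonymization_model.onnx"},
            pre_processor_map={"oob": ["source_video"]},
            inference_map={"oob": ["oob_score"]},
        )

        viz = HolovizOp(self, name="holoviz")

        # Wire the operators; the real workflow adds dynamic routing,
        # de-identification, and the multi-AI branch described below.
        self.add_flow(source, preprocessor)
        self.add_flow(preprocessor, out_of_body, {("tensor", "receivers")})
        self.add_flow(source, viz, {("output", "receivers")})


if __name__ == "__main__":
    AISurgicalVideoSketch().run()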

1. Out-of-Body Detection#

The workflow first uses an AI model to determine whether the endoscope is inside or outside the patient's body.
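A postprocessing step then reduces the model output to a per-frame inside/outside decision. Below is a minimal sketch of such an operator, not the workflow's actual implementation: the tensor name oob_score and the 0.5 threshold are assumptions.

import cupy as cp  # assuming the inference output arrives as a device tensor

from holoscan.core import Operator, OperatorSpec


class OutOfBodyPostprocessorOp(Operator):
    """Hypothetical postprocessor: thresholds the classifier score
    into an out-of-body flag."""

    def setup(self, spec: OperatorSpec):
        spec.input("in")
        spec.output("out_of_body")

    def compute(self, op_input, op_output, context):
        message = op_input.receive("in")
        # "oob_score" is an assumed name for the model's confidence tensor.
        score = float(cp.asarray(message.get("oob_score")).max())
        self.is_out_of_body = score > 0.5  # cache for routing decisions
        op_output.emit(self.is_out_of_body, "out_of_body")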

2. Dynamic Flow Control#

  • If outside the body: The video is de-identified through pixelation to protect privacy
  • If inside the body: The video is routed to the multi-AI pipeline (see the routing sketch below)
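The Holoscan SDK's dynamic flow control API is a natural fit for this routing: a callback runs each time the routing operator executes and selects exactly one downstream branch. A minimal sketch, assuming router is an operator that caches its latest decision in self.is_out_of_body (like the postprocessor above), and pixelator and multi_ai are the two branches; all names are illustrative.

# Inside Application.compose(), after the operators have been created:

def route_frame(op):
    # Called per execution of `router`; pick exactly one branch.
    if op.is_out_of_body:
        op.add_dynamic_flow(pixelator)  # outside the body: de-identify
    else:
        op.add_dynamic_flow(multi_ai)   # inside the body: multi-AI

# Declare both candidate connections, then let the callback choose.
self.add_flow(router, pixelator)
self.add_flow(router, multi_ai)
self.set_dynamic_flows(router, route_frame)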

3. Multi-AI Processing#

When inside the body, two AI models run concurrently (see the configuration sketch after this list):

  • SSD detection model identifies surgical tools with bounding boxes
  • MONAI segmentation model provides pixel-level segmentation of tools
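Both models can be served by a single InferenceOp, which supports loading several models and executing them in parallel. Below is a sketch of what such a configuration could look like inside compose(); the model keys and tensor names are placeholders, while the ONNX file names come from the Models table below.

multi_ai = InferenceOp(
    self,
    name="multi_ai",
    backend="trt",               # TensorRT backend
    allocator=pool,
    parallel_inference=True,     # execute the two models concurrently
    model_path_map={
        "ssd": "data/ssd_model/epoch24_nms.onnx",
        "tool_seg": "data/monai_tool_seg_model/"
                    "model_endoscopic_tool_seg_sanitized_nhwc_in_nchw_out.onnx",
    },
    pre_processor_map={          # input tensor(s) fed to each model
        "ssd": ["ssd_preprocessed"],
        "tool_seg": ["seg_preprocessed"],
    },
    inference_map={              # output tensor(s) produced per model
        "ssd": ["ssd_output"],
        "tool_seg": ["seg_output"],
    },
)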

4. Visualization#

The HolovizOp displays the processed video with overlaid AI results (configured as sketched below), including:

  • Bounding boxes around detected tools
  • Segmentation masks for tools
  • Text labels for detected tools
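These overlays map onto HolovizOp input tensors, each declared with a geometry type. A hedged configuration sketch follows; the tensor names, colors, and lookup table values are illustrative assumptions, not the workflow's actual settings.

viz = HolovizOp(
    self,
    name="holoviz",
    # Two-entry lookup table for the segmentation mask: background
    # transparent, tool class highlighted (values are illustrative).
    color_lut=[
        [0.0, 0.0, 0.0, 0.0],
        [0.1, 0.5, 1.0, 0.7],
    ],
    tensors=[
        dict(name="", type="color"),                # the video frame itself
        dict(name="rectangles", type="rectangles",  # detection bounding boxes
             color=[1.0, 0.0, 0.0, 1.0], line_width=2),
        dict(name="label_text", type="text",        # tool labels
             text=["tool"], color=[1.0, 1.0, 1.0, 1.0]),
        dict(name="seg_mask", type="color_lut"),    # mask drawn via the LUT
    ],
)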

Requirements#

Software#

  • Holoscan SDK >= v3.0: The Holohub command takes care of this dependency when using the Holohub container. Alternatively, you can install the Holoscan SDK via one of the methods specified in the SDK user guide.
  • Holoscan Sensor Bridge >= v2.0: Please see the Quick start guide for building the Holoscan Sensor Bridge Docker container.

Models#

This workflow utilizes the following three AI models:

| Model | Description | File |
|-------|-------------|------|
| 📦️ Out-of-body Detection Model | Detects whether the endoscope is inside or outside the body | anonymization_model.onnx |
| 📦️ SSD Detection for Endoscopy Surgical Tools | Detects surgical tools with bounding boxes | epoch24_nms.onnx |
| 📦️ MONAI Endoscopic Tool Segmentation | Provides pixel-level segmentation of tools | model_endoscopic_tool_seg_sanitized_nhwc_in_nchw_out.onnx |

Sample Data#

Note: The directory specified by --data at runtime is assumed to contain three subdirectories corresponding to the NGC resources specified in Models and Sample Data: orsi, monai_tool_seg_model, and ssd_model. These resources are downloaded automatically to the Holohub data directory when building the application.
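The expected layout therefore looks like this (the per-directory contents shown are indicative):

<DATA_DIR>/
├── orsi/                   # out-of-body detection model and sample endoscopy video
├── monai_tool_seg_model/   # MONAI endoscopic tool segmentation model
└── ssd_model/              # SSD surgical tool detection model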

Quick Start Guide#

./holohub run ai_surgical_video

This single command will create and launch the Holohub container, build the workflow, and run it with the default arguments set in the config.yaml file and a replayer source.

Using AJA Card as I/O#

./holohub run ai_surgical_video --run-args "--source aja"

Using Holoscan Sensor Bridge as I/O#

Running the workflow with --source hsb requires the Holoscan Sensor Bridge software to be installed. You can build a Holoscan Sensor Bridge container using the following commands:

git clone https://github.com/nvidia-holoscan/holoscan-sensor-bridge.git
cd holoscan-sensor-bridge
git checkout hsdk-3.0
./docker/build.sh --dgpu # for discrete GPU
./docker/build.sh --igpu # for integrated GPU

This will build a Docker image called hololink-demo:2.0.0.

Once you have built the Holoscan Sensor Bridge container, you can build the Holohub container and run the workflow using the following command:

./holohub run --base-img hololink-demo:2.0.0 --img holohub:link ai_surgical_video --run-args="--source hsb"

Advanced Usage#

Using Holohub Container#

First, you need to run the Holohub container:

./holohub run-container ai_surgical_video

Note: If using the Holoscan Sensor Bridge, please see the Using Holoscan Sensor Bridge as I/O section above for building the Holoscan Sensor Bridge Docker container first, which is tagged as hololink-demo:2.0.0, and then use the following command to run the Holohub container:

./holohub run-container --base-img hololink-demo:2.0.0 --img holohub:link

Building the Application#

Once your environment is set up, you can build the workflow using the following command:

./holohub build ai_surgical_video

Running the Application#

Using the Holohub Container from Outside the Container#

Using the Holohub container, you can run the workflow without building it again:

./holohub run ai_surgical_video --no-build

If you do want to rebuild the workflow, simply omit the --no-build flag:

./holohub run ai_surgical_video

Alternatively, you can run the application directly from the source directory:

cd <HOLOHUB_SOURCE_DIR>/workflows/ai_surgical_video/python
python3 ai_surgical_video.py --source hsb --data <DATA_DIR> --config <CONFIG_FILE>

TIP: You can get the exact "Run command" along with "Run environment" and "Run workdir" by executing:

./holohub run ai_surgical_video --dryrun --local

Command Line Arguments#

The application accepts the following command line arguments:

| Argument | Description | Default |
|----------|-------------|---------|
| -s, --source | Source of video input: replayer, aja, or hsb | replayer |
| -c, --config | Path to a custom configuration file | config.yaml in the application directory |
| -d, --data | Path to the data directory containing model and video files | Uses the HOLOHUB_DATA_PATH environment variable |
| --headless | Run in headless mode (no visualization) | False |
| --fullscreen | Run in fullscreen mode | False |
| --camera-mode | Camera mode to use [0,1,2,3] | 0 |
| --frame-limit | Exit after receiving this many frames | No limit |
| --hololink | IP address of the Hololink board | 192.168.0.2 |
| --log-level | Logging level to display | 20 |
| --ibv-name | IBV device to use | First available InfiniBand device |
| --ibv-port | Port number of the IBV device | 1 |
| --expander-configuration | I2C expander configuration (0 or 1) | 0 |
| --pattern | Configure to display a test pattern (0-11) | None |
| --ptp-sync | After reset, wait for PTP time to synchronize | False |
| --skip-reset | Don't call reset on the Hololink device | False |
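Arguments can be combined via --run-args. For example, the following illustrative invocation replays the sample video in headless mode and exits after 1000 frames:

./holohub run ai_surgical_video --run-args "--source replayer --headless --frame-limit 1000"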

Benchmarking#

Please refer to Holoscan Benchmarking for instructions on how to benchmark this workflow.