Building an ST 2110 Operator for NVIDIA Holoscan: Real-Time Surgical Intelligence in the Operating Room
March 10, 2026
Introduction: Why Surgical Intelligence Needs Real-Time Video Ingest
Modern surgery is filled with data but starved of real-time understanding. Surgeons navigate millimetre-scale anatomy under immense pressure, and they often make life-saving decisions in fractions of a second. A tiny delay in visual feedback can be the difference between identifying a bleeding vessel early or reacting too late.
At XRlabs, our goal is not simply to improve visualisation, but to build the Surgical Intelligence layer that current operating rooms lack. We retrofit existing medical devices, including exoscopes, endoscopes, microscopes, and robotic arms, with real-time perception, guidance, and automation capabilities. Through our XRlabs XR platform and Surgical OS, this intelligence reaches the surgeon as an unobstructed, context-aware view of anatomy and risk cues.
In our recent first-in-human neurosurgical deployment, we combined an exoscope, NVIDIA Holoscan, Jetson AGX Thor, and a custom SMPTE ST 2110 operator to create an ultra-low-latency pipeline that enabled a surgeon to operate directly from a headset view. This was not a demonstration in a lab. It was a multi-hour clinical procedure in which reliability, latency, and stability had direct consequences for surgical decision-making. By keeping nearly all processing on the GPU and using Holoscan’s operator graph, we preserved reaction-critical response times during events such as unexpected bleeding or rapid field changes. This workflow represents an early example of Physical AI in the operating room. The operator we are contributing to HoloHub forms part of a larger arc: real-time tool understanding, anatomy-aware guidance, digital twins for robotics, and eventually semi-autonomous behaviour.
Our intention is to share this work in a form that others can adapt, extend, and build upon, whether they work in healthcare or any other field where high-bandwidth real-time sensing matters. In this blog, we describe how we built and deployed the operator, what we learned under clinical conditions, and how you can use and expand it within Holoscan.
Why ST 2110 Matters in Real-Time Systems
Although ST 2110 is widely used in broadcast environments, it aligns naturally with surgical and robotic workflows:
- It offers uncompressed image quality, which is vital in domains where small visual details carry large consequences.
- It provides predictable timing characteristics, which are essential for stable overlays, synchronised multi-sensor perception, and low-latency decision loops.
- Many high-end imaging devices already output ST 2110 video, which enables intelligence upgrades without hardware replacement.
In this neurosurgical pipeline, the exoscope’s ST 2110 feed became our primary high-fidelity signal for real-time processing on Jetson AGX Thor.
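ST 2110-20 carries each video frame as a train of RTP packets (RFC 3550 framing, RFC 4175 payload), with the RTP marker bit flagging the last packet of a frame. As a self-contained illustration, independent of our operator's actual GPU reassembly path, here is a minimal parse of the fixed 12-byte RTP header:

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Parse the fixed 12-byte RTP header (RFC 3550) of an ST 2110-20 packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than RTP fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "marker": bool(b1 & 0x80),   # set on the final packet of a video frame
        "payload_type": b1 & 0x7F,
        "sequence": seq,             # used to detect loss and reordering
        "timestamp": ts,             # media clock timestamp (90 kHz for video)
        "ssrc": ssrc,
    }

# Synthetic example: a frame-ending packet with payload type 96
pkt = struct.pack("!BBHII", 0x80, 0x80 | 96, 4242, 123456, 0xDEADBEEF) + b"\x00" * 8
hdr = parse_rtp_header(pkt)
```

The sequence number and marker bit are what a receiver needs to decide when a frame is complete (or incomplete and must be dropped or concealed).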
System Overview: From Optics to Head-Mounted Display
The intraoperative pipeline followed this structure:
1. Surgical Exoscope → ST 2110 multicast: High-resolution optics delivered uncompressed video into the OR network.
2. Jetson AGX Thor running Holoscan:
   - Custom ST 2110 operator reconstructed RTP frames directly in GPU memory
   - CUDA-based colour conversion and format handling
   - A second operator performed forward-error-correction interleaving for wireless transmission
3. Wireless transport → XR Headset: Surgeons viewed the live exoscope signal with negligible perceived latency, enhanced by XRlabs overlays.
This configuration allowed us to maintain optical fidelity and temporal stability suitable for surgery.
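The colour conversion in step 2 runs as a CUDA kernel on Thor; as a hedged reference for what an NV12 path computes, here is a NumPy sketch of a full-range BT.709 RGB→NV12 conversion (the coefficients follow BT.709, but the exact matrix and range handling in the production operator may differ):

```python
import numpy as np

def rgb_to_nv12(rgb: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Convert an HxWx3 uint8 RGB image to NV12 planes (full-range BT.709).

    Returns (y_plane, uv_plane), where uv_plane is (H/2, W/2, 2) interleaved.
    """
    assert rgb.shape[0] % 2 == 0 and rgb.shape[1] % 2 == 0
    r, g, b = (rgb[..., i].astype(np.float32) for i in range(3))
    # Full-range BT.709 RGB -> YUV matrix
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    u = -0.1146 * r - 0.3854 * g + 0.5000 * b + 128.0
    v = 0.5000 * r - 0.4542 * g - 0.0458 * b + 128.0
    # 4:2:0 chroma subsampling: average each 2x2 block
    u_sub = u.reshape(u.shape[0] // 2, 2, u.shape[1] // 2, 2).mean(axis=(1, 3))
    v_sub = v.reshape(v.shape[0] // 2, 2, v.shape[1] // 2, 2).mean(axis=(1, 3))
    uv = np.stack([u_sub, v_sub], axis=-1)
    clip = lambda a: np.clip(np.round(a), 0, 255).astype(np.uint8)
    return clip(y), clip(uv)
```

NV12 matters here because it is the layout the hardware codec paths consume directly, so a correct conversion avoids an extra format hop before encoding.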
Working with Holoscan
Whilst Holoscan is a relatively new framework, its comprehensive documentation and maturing ecosystem gave us the confidence to dive into creating our own operators. The holohub repository has excellent examples, but we still picked up several key lessons:
Use GXF’s standardised tensor and buffer types. This allows your operator to integrate with others in a pipeline, take advantage of zero-copy efficiency, and stay portable across the NVIDIA ecosystem.
Consider your schedulers. Operators with multiple downstream outputs can create backpressure in your graph, particularly around expensive fragments handling ML tasks. Consider multi-threaded or asynchronous scheduling patterns when throughput is time-critical.
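For example, swapping the default greedy scheduler for a multi-threaded one is only a few lines in Holoscan’s Python API. This is a sketch of that framework configuration; `worker_thread_number` and the other keyword arguments should be checked against your SDK version:

```python
from holoscan.core import Application
from holoscan.schedulers import MultiThreadScheduler


class PipelineApp(Application):
    def compose(self):
        # Add operators and flows here
        ...


def main():
    app = PipelineApp()
    # Multiple worker threads let an expensive ML fragment run without
    # stalling time-critical ingest operators upstream of it.
    app.scheduler(MultiThreadScheduler(app, worker_thread_number=4,
                                       name="multithread_scheduler"))
    app.run()


if __name__ == "__main__":
    main()
```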
Add Python bindings! Holoscan’s comprehensive Python support helps other developers pull your operators into their own pipelines, keeping the ecosystem thriving. The holohub build helper makes adding bindings straightforward, so make use of it.
Building the ST 2110 Operator
Key design principles:
1. GPU-resident frame handling. Packet reconstruction occurs directly in device memory whenever possible, reducing jitter and maintaining alignment with downstream operators.
2. Jetson-friendly network ingestion. We used optimised kernel network interfaces rather than DPDK, DOCA, or Rivermax, which ensures broad compatibility across Jetson devices.
3. Configurable output formats. The operator supports NV12 and RGB outputs, allowing integration with both hardware codec paths and XR rendering pipelines.
4. HoloHub compliance. The operator follows HoloHub conventions for structure, configuration, documentation, and example applications.
It is intentionally a version 0.1: narrow, practical, and ready for community extension.
Example Holoscan Application
from holoscan.core import Application
from holoscan.operators import HolovizOp  # Holoscan's built-in visualiser

from xr_st2110 import St2110SourceOp
from xr_fec import FecInterleaveOp


class NeuroExoscopeApp(Application):
    def compose(self):
        # Receive the exoscope's ST 2110 multicast and reassemble frames on the GPU
        st2110 = St2110SourceOp(
            self,
            name="st2110",
            ip="239.1.2.3",
            port=5004,
            width=3840,
            height=2160,
            framerate=60,
            output_format="rgb",
        )
        # Interleave forward-error-correction blocks for the wireless hop
        fec = FecInterleaveOp(self, name="fec", block_size=20, redundancy=4)
        display = HolovizOp(self, name="display")

        self.add_flow(st2110, fec, {("video", "in")})
        self.add_flow(fec, display, {("video", "receivers")})


if __name__ == "__main__":
    NeuroExoscopeApp().run()
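FecInterleaveOp is our custom operator and its wire format is not covered here, but as a hypothetical sketch of the underlying idea: one XOR parity packet per block lets the receiver rebuild a single lost packet in that block.

```python
def make_parity(group: list[bytes]) -> bytes:
    """XOR a group of equal-length packets into one parity packet."""
    parity = bytearray(len(group[0]))
    for pkt in group:
        for i, byte in enumerate(pkt):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(group, parity: bytes) -> bytes:
    """Rebuild the single missing packet (marked None) from the survivors and the parity."""
    missing = bytearray(parity)
    for pkt in group:
        if pkt is not None:
            for i, byte in enumerate(pkt):
                missing[i] ^= byte
    return bytes(missing)
```

Interleaving blocks across time, as the real operator does, spreads a wireless burst loss over many parity groups so that each group ideally sees at most one missing packet.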
Lessons from a Real Neurosurgical Deployment
Working inside an operating theatre revealed behaviours that are difficult to replicate in synthetic tests:
- Latency must be stable across fluctuating network conditions.
- Maintaining GPU residency greatly reduces timing variability.
- Head-mounted displays change operator behaviour, reducing the need to shift visual attention.
- Consistent ingest improves every downstream process, from overlays to AI inference.
These insights informed our design for robustness rather than theoretical optimisation.
Beyond Surgery: Enabling Physical AI Across Domains
Although our motivation was surgical intelligence, this operator is broadly relevant to anyone building real-time systems that depend on:
- High-bandwidth sensor ingest
- Deterministic timing
- GPU-first processing
- Low-latency visualisation or control loops
- Multi-operator Holoscan graphs
- Digital twins or robotics workflows
- Edge compute on Jetson Orin or Thor
Applications include:
- Industrial robotics and teleoperation
- Remote inspection and maintenance systems
- Live-media processing and virtual production
- Autonomous drones and aerial imaging
- High-fidelity simulation or digital twin systems
- Real-time environmental or scientific instrumentation
The goal is to provide a practical reference implementation for ST 2110 ingest on Holoscan that others can adapt to their domain-specific needs.
Conclusion
This first deployment demonstrated that Holoscan, Jetson AGX Thor, and a carefully engineered ST 2110 pipeline can deliver real-time performance in one of the most demanding environments possible. The operator serves as a foundation for systems that rely on high-fidelity video as a primary signal, whether for perception, decision-making, or control.
By contributing this operator to HoloHub, we hope to support the wider developer community in building the next generation of Physical AI applications, from intelligent surgical systems to advanced robotics, simulation, and live-media workflows. Real-time intelligence begins with reliable, high-quality sensor ingest, and we are excited to see how others will build upon this foundation.