Announcing Holoscan 4.0: The Deploy Stack for Physical AI & Raw-to-Insights Sensors

Written March 10, 2026 by Jay Carlson

Physical AI systems are fundamentally different from traditional AI workloads. Instead of processing static datasets or running an inference pipeline between a single camera and display, these systems must continuously ingest high-bandwidth, multi-modal, multi-rate sensor data, run complex AI models, and take real-time actions in the physical world.

Holoscan 4.0 is the first major release that tackles the challenges of deploying AI and accelerated compute for physical AI systems, including humanoid and industrial robotics, automotive, laboratory, and active-scanning sensor systems.

GPU-Resident Graphs

Holoscan 4.0 introduces the ability to execute entire compute graphs on the GPU. Rather than repeatedly moving execution between CPU and GPU, GPU-Resident Graphs run end-to-end on the GPU — all with no CPU scheduling. This dramatically reduces jitter to improve determinism, and can also reduce the API surface area needed for testing highly regulated products.
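The scheduling difference can be pictured with a toy executor (plain Python, not the Holoscan API — the names below are purely illustrative): the host-driven version re-dispatches each node from the CPU, while the fused version is built once up front and then launched as a single unit, analogous to instantiating the whole graph on-device.

```python
from functools import reduce

# Toy operators: each stage transforms a value.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]

def host_driven(x):
    # The host re-enters the scheduler between every node
    # (analogous to one CPU->GPU launch per operator).
    for stage in stages:
        x = stage(x)
    return x

# Fuse the pipeline once, up front; a single call then runs
# end-to-end with no per-node scheduling at run time.
fused = reduce(lambda f, g: (lambda v: g(f(v))), stages)

print(host_driven(5), fused(5))  # both compute ((5 + 1) * 2) - 3 = 9
```

Both paths compute the same result; the point is where the per-node dispatch decisions happen — at every step, or once at build time.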

Pub/Sub Framework — A Distributed Runtime for Processing Pipelines

Holoscan 4.0 introduces native support for pub/sub messaging, enabling developers to create multiple Holoscan graphs across application boundaries and communicate with them dynamically. You get modular multi-process graphs, fault isolation between pipeline components, and distributed deployment across machines.

The feature is transport-agnostic; users can create concrete implementations using whichever message-passing framework they prefer. We're launching a Fast DDS example in HoloHub soon.
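A transport-agnostic pub/sub layer is usually structured as an abstract transport interface with concrete backends plugged in behind it. The sketch below (plain Python, not the Holoscan API; class and topic names are invented for illustration) shows the pattern with a trivial in-process backend standing in for something like Fast DDS:

```python
from abc import ABC, abstractmethod
from collections import defaultdict

class Transport(ABC):
    """Abstract message transport; concrete subclasses could wrap
    Fast DDS, ZeroMQ, etc. Everything here is illustrative only."""
    @abstractmethod
    def publish(self, topic, payload): ...
    @abstractmethod
    def subscribe(self, topic, callback): ...

class InProcessTransport(Transport):
    """Trivial backend: synchronous delivery within one process."""
    def __init__(self):
        self._subs = defaultdict(list)
    def publish(self, topic, payload):
        for cb in self._subs[topic]:
            cb(payload)
    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

bus = InProcessTransport()
received = []
bus.subscribe("detections", received.append)
bus.publish("detections", {"label": "weld_seam", "score": 0.93})
print(received)  # [{'label': 'weld_seam', 'score': 0.93}]
```

Because application code only talks to the `Transport` interface, swapping the in-process backend for a distributed one changes deployment, not pipeline logic.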

Other Features

GStreamer Interoperability. Many camera and media pipelines rely on GStreamer. Developers can now pass data between Holoscan graphs and GStreamer pipelines — in both directions — opening up a rich ecosystem of existing multimedia codecs, sources, sinks, and streaming.

EtherCAT support. Holoscan now includes EtherCAT support, enabling integration with industrial robotics and motion control: robot joint controllers, motor drives, industrial actuators, and high-precision motion systems. That integration makes it possible to build applications that combine AI perception with deterministic robotic control.

Pose Tree. Many physical systems depend on complex spatial relationships between sensors, robots, and the environment. Holoscan includes native Pose Tree support to manage coordinate transforms between sensors, robot links, tools, and world frames.
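The core idea behind a pose tree can be sketched in a few lines (plain Python with 2D poses for brevity — not the Holoscan Pose Tree API; frame names and the `(x, y, theta)` convention are illustrative): each frame stores its pose relative to a parent, and the transform between any two frames is recovered by composing up to a common root.

```python
import math

def compose(a, b):
    """Pose of b expressed through a; poses are (x, y, theta)."""
    ax, ay, at = a
    bx, by, bt = b
    return (ax + bx * math.cos(at) - by * math.sin(at),
            ay + bx * math.sin(at) + by * math.cos(at),
            at + bt)

def invert(p):
    x, y, t = p
    c, s = math.cos(t), math.sin(t)
    return (-(x * c + y * s), -(-x * s + y * c), -t)

# Parent-frame edges: child -> (parent, pose of child in parent).
tree = {
    "lidar":   ("base", (0.2, 0.0, 0.0)),
    "gripper": ("base", (0.0, 0.5, 0.0)),
    "base":    ("world", (1.0, 2.0, math.pi / 2)),
}

def pose_in_world(frame):
    pose = (0.0, 0.0, 0.0)
    while frame in tree:
        frame, parent_pose = tree[frame]
        pose = compose(parent_pose, pose)
    return pose

def transform(src, dst):
    """Pose of src expressed in dst's frame."""
    return compose(invert(pose_in_world(dst)), pose_in_world(src))

x, y, theta = transform("lidar", "gripper")
print(round(x, 6), round(y, 6))  # 0.2 -0.5
```

With both sensors mounted on `base`, the lidar sits 0.2 m ahead of and 0.5 m below the gripper in its own frame, regardless of where `base` is in the world — exactly the bookkeeping a pose tree automates.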

The Road to Raw-to-Insights on Active-Scanning Sensors

Active-scanning sensors are a novel class of physical AI systems — systems that emit energy into the environment and analyze the reflected signal coming back, measuring distance, structure, velocity, or material properties. Automotive radar, ultrasound scanners, and MRI machines used for industrial and medical imaging are examples of these systems. Like robotics and other physical AI systems, active-scanning sensors have control loops that steer the output based on feedback received from inputs.

Historically, the TX path — responsible for generating waveforms, beam steering, or modulation sequences — was implemented in FPGA logic, where deterministic timing enabled cycle-accurate control of DACs, phased arrays, and RF front-ends.

These are complex systems built from mountains of Verilog code, where even small changes can take hours to recompile. And LLM-based AI coding assistants are notoriously poor at boosting developer productivity for FPGA development.

While developers would love to move these TX algorithms to computing platforms using modern languages like C++ or Python, traditional CPUs introduce massive amounts of jitter and latency that disrupt the timing-critical workload of these pipelines.

GPU-Resident Graphs enable these systems to move toward software-defined TX pipelines, where waveform generation and adaptive scanning algorithms are written in CUDA and run directly on the GPU — not in Verilog running on FPGAs.
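To make "software-defined TX" concrete, here is the math for one common transmit waveform, a linear-FM (chirp) pulse, in plain Python. This is only the waveform arithmetic — in a real pipeline the same expression would run in a CUDA kernel feeding a DAC, and the sample rates and sweep limits below are invented for illustration:

```python
import math

def lfm_chirp(fs, duration, f0, f1):
    """Linear-FM pulse: instantaneous frequency sweeps f0 -> f1
    over `duration` seconds, sampled at fs Hz."""
    n = round(fs * duration)
    k = (f1 - f0) / duration          # sweep rate, Hz/s
    return [math.cos(2 * math.pi * (f0 * t + 0.5 * k * t * t))
            for t in (i / fs for i in range(n))]

# 50 us pulse at 1 MS/s, sweeping 50 kHz -> 200 kHz.
pulse = lfm_chirp(fs=1_000_000, duration=50e-6, f0=50_000, f1=200_000)
print(len(pulse))  # 50 samples
```

Changing the sweep on the fly — the "adaptive scanning" case — is then just recomputing this array with new parameters, rather than re-synthesizing FPGA logic.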

On the RX side, classical signal-processing algorithms such as beamforming, matched filtering, Doppler processing, and FFT-based spectral analysis processed the returning signal before any higher-level interpretation occurred. These signal-processing techniques leave vast quantities of data on the table. They make too many assumptions about the environment (like an ultrasound probe's speed of sound through tissue), and mask out "noise" that might actually be signal.
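Matched filtering, the classical RX step named above, is just cross-correlation of the received signal with the transmitted template. A minimal real-valued sketch (toy signals, invented values; radar and ultrasound would work at complex baseband):

```python
def matched_filter(rx, template):
    """Classical matched filter: correlate the received signal
    with the transmitted template at every lag."""
    n, m = len(rx), len(template)
    return [sum(rx[i + j] * template[j] for j in range(m))
            for i in range(n - m + 1)]

template = [1.0, -1.0, 1.0]               # toy transmitted code
rx = [0.0, 0.0, 1.0, -1.0, 1.0, 0.0]      # echo delayed by 2 samples
out = matched_filter(rx, template)
print(out.index(max(out)))  # peak at lag 2 -> the round-trip delay
```

The peak lag gives range, and that is essentially all this stage extracts — which is the article's point: everything else in the raw return is discarded before any model ever sees it.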

Times have changed, though. Jetson Thor is the first low-power SoC to provide more than 2000 AI TOPS of performance, and Blackwell-powered RTX Pro GPUs provide even more firepower. Platforms running Holoscan Sensor Bridge can ingest hundreds of gigabits per second of sensor data with near-zero CPU intervention.

The time is right to bring high-bandwidth real-time raw-to-insights AI inference to these sensor systems.

Holoscan makes it possible to build these systems as software — not hardware. Instead of writing thousands of lines of Verilog and waiting hours for FPGA synthesis, developers can implement real-time sensor pipelines in CUDA and C++, integrate AI models directly into the data path, and iterate at software speed. If you’re building the next generation of radar, ultrasound, robotics, or industrial sensing platforms, download the Holoscan SDK and start building your first Raw-to-Insights pipeline today.