HSB RoCE Receiver Operator with No Host Metadata#
Authors: Holoscan Team (NVIDIA)
Supported platforms: aarch64
Language: C++
Last modified: March 13, 2026
Latest version: 0.1.0
Minimum Holoscan SDK version: 4.0.0
Tested Holoscan SDK versions: 4.0.0
Contribution metric: Level 1 - Highly Reliable
This operator receives video frames over RoCE (RDMA over Converged Ethernet) directly to GPU memory for high-performance, low-latency video ingestion. It has an option to skip metadata processing on the host, so that metadata is never copied from the device to the host CPU. This is useful for Holoscan SDK GPU-resident applications.
This operator forks the Holoscan Sensor Bridge 2.5.0 RoCE Receiver Operator implementation and extends it with frame_memory_base access to support GPU Resident operations.
Overview#
The RoceReceiverOp is a Holoscan operator that:
- Receives video frames via RDMA directly to GPU memory
- Supports multi-page buffering for continuous streaming
- Provides optional GPU-resident metadata processing
- Extends
BaseReceiverOpfrom the hololink library
Requirements#
- Holoscan SDK 4.0.0 or later
- Holoscan Sensor Bridge ("Hololink") library installed and available
- CUDA-capable GPU with GPUDirect RDMA support
- libibverbs (InfiniBand Verbs library)
- RoCE-capable NIC (e.g., ConnectX)
Usage#
C++#
#include <hsb_roce_receiver_nmd/roce_receiver_op.hpp>
// In your application setup:
auto receiver = make_operator<hololink::operators::RoceReceiverOp>(
"roce_receiver",
holoscan::Arg("hololink_channel", data_channel),
holoscan::Arg("device_start", start_fn),
holoscan::Arg("device_stop", stop_fn),
holoscan::Arg("frame_context", cuda_context),
holoscan::Arg("frame_size", frame_size),
holoscan::Arg("ibv_name", "roceP5p3s0f0"),
holoscan::Arg("ibv_port", 1u),
holoscan::Arg("pages", 2u),
holoscan::Arg("queue_size", 1u),
holoscan::Arg("skip_host_metadata", false)
);
// Add to your pipeline
add_flow(receiver, next_operator, ('output', 'input'));
Parameters#
| Parameter | Type | Default | Description |
|---|---|---|---|
hololink_channel |
DataChannel* | - | Pointer to Hololink DataChannel object |
device_start |
std::function |
- | Function called to start the device |
device_stop |
std::function |
- | Function called to stop the device |
frame_context |
CUcontext | - | CUDA context for frame memory |
frame_size |
size_t | - | Size of one frame in bytes |
ibv_name |
string | "roceP5p3s0f0" | InfiniBand Verbs device name |
ibv_port |
uint32_t | 1 | Port number of IBV device |
pages |
uint32_t | 2 | Number of pages for receiver memory |
queue_size |
uint32_t | 1 | Number of buffers that can be queued |
skip_host_metadata |
bool | false | Skip copying metadata to host |
trim |
bool | false | Trim output to bytes_written |
use_frame_ready_condition |
bool | true | Use async condition for frame readiness |
Output Metadata#
The operator publishes the following metadata with each frame:
received_frame_number: Sequence number of received framerx_write_requests: Total RDMA write requestsreceived_s,received_ns: Receive timestampimm_data: Immediate data from RDMA writeframe_memory: GPU device pointer to frame datadropped: Number of dropped framesframe_number: Frame number from devicetimestamp_s,timestamp_ns: Device timestampcrc: Frame CRC valuebytes_written: Actual bytes written
Variants#
RoceReceiverNoHostMetadata#
A variant that keeps metadata on the GPU and avoids host copies. Enable this with skip_host_metadata=true for GPU-resident processing pipelines.
Build#
This operator is built as part of the HoloHub build system:
./run build hsb_roce_receiver_nmd
Dependencies#
hololink::core- Core hololink libraryhololink::common- Common hololink utilitieshololink::operators::base_receiver- Base receiver operatorholoscan::core- Holoscan SDKibverbs- InfiniBand Verbs libraryCUDA- CUDA driver APIfmt- Formatting library