A Study using Asynchronous Lock-free Buffer with SCHED_DEADLINE#
Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: November 25, 2025
Latest version: 1.0.0
Minimum Holoscan SDK version: 3.5.0
Tested Holoscan SDK versions: 3.5.0
Contribution metric: Level 3 - Developmental
This tutorial demonstrates the impact of using an
asynchronous lock-free buffer
with
SCHED_DEADLINE scheduling policy in Linux on the message latency in a Holoscan
SDK application and compares it with the default buffer.
Application Configuration#
The application source code is provided in the application directory.
The application consists of two PingTxOp operators (tx1 and tx2) and one PingRxOp operator (rx). Both tx1 and tx2 generate messages and send them to rx.
graph TD;
tx1("tx1") -->|out -> in1| rx("rx");
tx2("tx2") -->|out -> in2| rx("rx");
tx1: Sends messages after busy-waiting for 5ms. The busy-wait is representative of reading sensor data and a brief processing time.tx2: Sends messages after busy-waiting for 10ms. The busy-wait is representative of reading sensor data and a brief processing time.rx: Receives messages from bothtx1andtx2, and calculates message latency and inter-message interval period. Then, it waits for a brief 1 ms which could be representative of acutating a signal after a brief processing time.
The connection between the operators is configured to use either a default
buffer (DoubleBuffer) or an async lock-free buffer.
Experiment Run Instructions#
Application-specific run instructions are provided in the application directory.
The following is an example application run instruction:
./holohub run async_buffer_deadline_tutorial --as-root --docker-opts='--ulimit rtprio=99 --cap-add=CAP_SYS_NICE'
We need to run the application with root privileges and other flags as
SCHED_DEADLINE Linux scheduling policy requires those flags.
Experiment Scripts#
To run all the experiments in this tutorial:
./tutorials/async_buffer_deadline_tutorial/run_experiment.sh
The above script will run the experiment and generate the plots in
period_variation under this directory.
Experimental Setup#
The experiment demonstrates how an async lock-free buffer can allow Holoscan SDK
operators to be run with different SCHED_DEADLINE periods independently
without being affected by each other's periods or runtimes.
We measure two key metrics:
- Max Message Latency: The highest latency observed for messages being
generated at tx1 (or at tx2) and then received at rx.
- Max Message Interval: The longest time interval between two consecutive
messages from tx1 (or from tx2) and then received at rx.
We run two main scenarios with fixed rx period of 10ms:
-
Fixed
tx1Period, Varyingtx2Period:tx1period is fixed at 20ms.tx2period is varied from 20ms to 100ms.- We measure the impact on
tx1's latency and message interval. - Since the periods
tx1andrxare fixed, the message timings oftx1must ideally not be impacted with varyingtx2periods.
-
Fixed
tx2Period, Varyingtx1Period:tx2period is fixed at 20ms.tx1period is varied from 20ms to 100ms.- We measure the impact on
tx2's latency and message interval. - Since the periods
tx2andrxare fixed, the message timings oftx2must ideally not be impacted with varyingtx1periods.
Results#
The results show that with the default buffer, the performance of one operator is heavily dependent on the other. However, with the async lock-free buffer, they are decoupled enabling true asynchronous execution of the operators.
TX1 Message Latency vs. TX2 Period#

With the default buffer, as tx2's period increases, tx1's maximum latency
also increases linearly. Since rx cannot run before both the upstream
operators (tx1 and tx2) have written messages for it, the latency of tx1
is affected by tx2. With the async lock-free buffer, tx1's latency
remains consistent and low, regardless of tx2's period. Therefore, async
lock-free buffer unlocks independent connection between tx1 and rx
irrespective of the behavior of tx2.
IN1 Message Interval vs. TX2 Period#

Similarly, the maximum message interval for tx1 (at the in1 port of rx)
increases with tx2's period when using the default buffer.
The async lock-free buffer keeps the interval stable.
TX2 Message Latency vs. TX1 Period#

The same trend is visible here. tx2's message latency is affected by tx1's period with the default buffer, but not with the async lock-free buffer.
IN2 Message Interval vs. TX1 Period#

The message interval for tx2 (at the in2 port of rx) remains stable with the async lock-free buffer, independent of tx1's period.
Conclusion and Guidance#
The asynchronous lock-free buffer connection between operators enables true
independent execution when using SCHED_DEADLINE Linux scheduling policy.
This buffer type allows each operator to maintain its specified runtime and
period without interference from other operators in the pipeline.
Key Insights:
- With async lock-free buffer: Operators run independently with their
configured SCHED_DEADLINE runtime and periods, achieving predictable real-time performance
- With default buffer (DoubleBuffer): Operators become coupled, where one
operator's performance can be impacted by another operator's
behavior, even when SCHED_DEADLINE policy is applied
Developer Recommendations:
1. Use async lock-free buffers when implementing soft real-time applications
with SCHED_DEADLINE scheduling to ensure predictable operator performance
2. Avoid default buffers in SCHED_DEADLINE scenarios where operator
independence is critical for meeting real-time constraints
3. Using default buffer with SCHED_DEADLINE policy means satisfying both
the constraints of the double buffer and periodic execution of
SCHED_DEADLINE. Depending on the application, this may not be desirable.
4. Using SCHED_DEADLINE for a chosen few operators (for example, source
operators in a DAG) along with default
buffer may provide a good balance because this provides predictable execution
for chosen few SCHED_DEADLINE operators while allowing applications to run
normally otherwise.
3. Test both buffer types during development to understand the performance
implications in your specific use case, especially when using SCHED_DEADLINE
policy
4. Monitor message latency and intervals to verify that operators maintain
their intended timing characteristics