
πŸ“· Holoscan SAM2ΒΆ

Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Language: Python
Latest version: 1.0.0
Minimum Holoscan SDK version: 2.0.0
Tested Holoscan SDK versions: 2.0.0
Contribution metric: Level 1 - Highly Reliable

This application demonstrates how to run SAM2 models on a live video feed, with the ability to change query points in real time.

[Image: HoloHub SAM2 demo]

The application currently uses a single query point as a foreground point that moves along the perimeter of a circle at a configured angular speed. The model returns three masks, and the best mask is selected based on the model's scores. For visualization, two options exist; select between "logits" and "masks":

- "logits": the network's raw predictions, mapped onto a colorscale that matches matplotlib.pyplot's "viridis"
- "masks": binarized predictions
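The rotating query point can be pictured as follows. This is a minimal sketch with illustrative names and defaults, not the application's actual code:

```python
import numpy as np

# Hypothetical helper mirroring the behavior described above: the query point
# orbits a circle of radius `radius` around (cx, cy) at a fixed angular speed.
def query_point(frame_idx: int, cx: float = 640.0, cy: float = 360.0,
                radius: float = 100.0, angular_speed: float = 0.05) -> np.ndarray:
    angle = angular_speed * frame_idx  # radians advanced per frame
    return np.array([[cx + radius * np.cos(angle),
                      cy + radius * np.sin(angle)]])  # shape (1, 2): one (x, y) point
```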

SAM2, recently announced by Meta, is the next iteration of the Segment Anything Model (SAM). This new version expands upon its predecessor by adding the capability to segment both videos and images. This sample application wraps the ImageInference class and applies it to a live video feed.
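Under the hood, a point-prompted prediction with the SAM2 predictor API looks roughly like this; the config/checkpoint paths follow the facebookresearch/sam2 release and are illustrative:

```python
import numpy as np
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Checkpoint/config names follow the SAM2 release; adjust paths to your setup.
predictor = SAM2ImagePredictor(build_sam2("sam2_hiera_l.yaml", "checkpoints/sam2_hiera_large.pt"))

frame = np.zeros((720, 1280, 3), dtype=np.uint8)   # stand-in for one RGB video frame
predictor.set_image(frame)

# Single foreground query point (label 1 = foreground, 0 = background)
masks, scores, logits = predictor.predict(
    point_coords=np.array([[640.0, 360.0]]),
    point_labels=np.array([1]),
    multimask_output=True,                          # returns three candidate masks
)
best_mask = masks[np.argmax(scores)]                # keep the highest-scoring mask
```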

Note: This demo currently uses "sam2_hiera_l.yaml", but any of the SAM2 models will work; you only need to adjust segment_one_thing.yaml.
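For example, swapping in the small variant would look like this, assuming the config/checkpoint names shipped with the SAM2 release:

```python
from sam2.build_sam import build_sam2

# Illustrative: load the small variant instead of the large one used by the demo.
model = build_sam2("sam2_hiera_s.yaml", "checkpoints/sam2_hiera_small.pt")
```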

βš™οΈ Setup InstructionsΒΆ

The app defaults to using the video device at /dev/video0.

To verify that this is the correct device, install v4l2-ctl (part of the v4l-utils package):

```bash
sudo apt-get install v4l-utils
```

To check for your devices, run:

```bash
v4l2-ctl --list-devices
```

This command will output something similar to this:

```
NVIDIA Tegra Video Input Device (platform:tegra-camrtc-ca):
        /dev/media0

vi-output, lt6911uxc 2-0056 (platform:tegra-capture-vi:0):
        /dev/video0

Dummy video device (0x0000) (platform:v4l2loopback-000):
        /dev/video3
```
Determine your desired video device and set it as the source device in segment_one_thing.yaml.
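For reference, the device configured there is typically consumed by a V4L2 capture operator inside the app. A minimal sketch of that wiring, assuming Holoscan's usual YAML-driven pattern (the actual graph lives in applications/sam2/segment_one_thing.py):

```python
from holoscan.core import Application
from holoscan.operators import V4L2VideoCaptureOp

# Illustrative excerpt of how a Holoscan app typically consumes the "source"
# block of its YAML config; not the application's actual code.
class SegmentOneThingApp(Application):
    def compose(self):
        source = V4L2VideoCaptureOp(
            self,
            name="source",
            **self.kwargs("source"),  # picks up e.g. device: /dev/video0 from the YAML
        )
        self.add_operator(source)    # downstream inference/visualization omitted
```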

πŸš€ Build and Run InstructionsΒΆ

ARM64 and x86

This application uses a custom Dockerfile based on a PyTorch container. Build and run the application with:

```bash
./dev_container build_and_run sam2 --docker_file applications/sam2/Dockerfile --img holohub:sam2.1
```

Or first build the container, then launch it and run the application inside it:

```bash
./dev_container build --docker_file applications/sam2/Dockerfile --img holohub:sam2.1
./dev_container launch --img holohub:sam2.1
./run launch sam2
```

x86 only

If you are using an x86 system only, you may use a Dockerfile based on the Holoscan container. Replace the Dockerfile with this alternative Dockerfile. Then, from the HoloHub main directory, run:

```bash
./dev_container build_and_run sam2
```

Alternatively, build and run via VS Code:

```bash
./dev_container vscode sam2
```

Run the application in debug mode from VS Code, or execute it with:

```bash
python applications/sam2/segment_one_thing.py
```

You can choose to output "logits" or "masks" in the configuration of the postprocessor and Holoviz operators in segment_one_thing.yaml.
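Conceptually, the two output modes amount to the following sketch; the function name and normalization details are illustrative, not the application's actual postprocessor code:

```python
import numpy as np
from matplotlib import colormaps

# Sketch of the two visualization modes described above.
def visualize(logits: np.ndarray, mode: str = "logits") -> np.ndarray:
    if mode == "logits":
        # Map raw network predictions onto matplotlib's "viridis" colorscale.
        norm = (logits - logits.min()) / (np.ptp(logits) + 1e-8)
        return colormaps["viridis"](norm)[..., :3]  # drop alpha: RGBA -> RGB floats
    # "masks": binarize at logit 0 (i.e., probability 0.5).
    return (logits > 0).astype(np.uint8)
```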

πŸ’» Supported HardwareΒΆ

  • x86 w/ dGPU
  • IGX devKit w/ dGPU

πŸ™Œ AcknowledgementsΒΆ

  • Meta, SAM2: for providing the models and inference infrastructure