Skip to content

Adding a GUI to Holoscan Python Applications

Authors: Wendell Hom (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Language: Python
Latest version: 0.1.0
Minimum Holoscan SDK version: 1.0.3
Tested Holoscan SDK versions: 2.3.0
Contribution metric: Level 1 - Highly Reliable

When developing Holoscan applications, incorporating a graphical user interface (GUI) can enhance usability and allow modification of the application's behavior at runtime.

This tutorial demonstrates how GUI controls were integrated into the Florence-2 Python application using PySide6. This addition enables users to dynamically change the vision task performed by the application.

Holoscan VILA Live

Table of Contents

  1. Overview
    1. Dockerfile
    2. Application Code
    3. GUI Code
  2. Creating the GUI Widgets and Layout
  3. Starting the Holoscan Application Thread
  4. Adding a GUI to Your Own Application

Overview

The Florence-2 application includes the typical components of a Holohub application, with the addition of a GUI component. The main components are:

  • Dockerfile: For installing additional dependencies.
  • Application Code: Defines the Holoscan application and its operators.
  • GUI Code: Utilizes PySide6 to add UI controls.

Dockerfile

The Dockerfile is used in Holohub when the application requires additional dependencies. For GUI functionality, this application's Dockerfile installs:

  • qt6-base-dev for Qt6 framework
  • PySide6 for Python bindings for Qt6 (as specified in requirements.txt)

Application Code

The Florence-2 application code is organized across several files:

  • florence2_app.py: Main application code.
  • florence2_op.py: Florence-2 model inference code.
  • florence2_postprocessor_op.py: Post-processing code to send overlays (e.g., bounding boxes, labels, segmentation masks) to Holoviz.
  • config.yaml: Default application parameters.

The Florence-2 application can be run independently of the GUI code. E.g., the application can be run with python application/florence-2-vision/florence2_app.py inside the Florence-2 Docker container. This will run the application without the GUI controls. The only code needed for GUI integration in the application code is the set_parameters() method in the FlorenceApp class. This method updates two fields in the Florence-2 operator:

class FlorenceApp(Application):
    def set_parameters(self, task, prompt):
        """Set parameters for the Florence2Operator."""
        if self.florence_op:
            self.florence_op.task = task
            self.florence_op.prompt = prompt

These updated parameters are passed to the model during the next compute() method execution of the Florence-2 operator.

GUI Code

The GUI code resides in qt_app.py. The code in this file defines a class for the main window which calls setupUi() and runHoloscanApp() when the instance is initialized.

class Window(QMainWindow):
    def __init__(self, parent=None):
        super().__init__(parent)
        self.setupUi()  # Setup the UI
        self.runHoloscanApp()  # Run the Holoscan application

At a high level, this is all we need to launch a Python Holoscan application with a GUI. The setupUi() method defines the GUI widgets and layout, while runHoloscanApp() runs the Florence-2 application in a separate thread within the process. Details of these methods are explored in the following sections.

Creating the GUI Widgets and Layout

The setupUi() method creates the GUI with a few simple widgets using PySide6 APIs. For those unfamiliar with PySide6, this tutorial provides an introduction.

    def setupUi(self):
        """Setup the UI components."""
        self.setWindowTitle("Florence-2")
        self.resize(400, 150)
        self.centralWidget = QWidget()
        self.setCentralWidget(self.centralWidget)

        layout = QVBoxLayout()

        # Create and add dropdown for task selection
        self.dropdown = QComboBox()
        self.dropdown.addItems(
            [
                "Object Detection",
                "Caption",
                "Detailed Caption",
                "More Detailed Caption",
                "Dense Region Caption",
                "Region Proposal",
                "Caption to Phrase Grounding",
                "Referring Expression Segmentation",
                "Open Vocabulary Detection",
                "OCR",
                "OCR with Region",
            ]
        )
        layout.addWidget(QLabel("Select an option:"))
        layout.addWidget(self.dropdown)

        # Create and add text input for prompt
        self.text_input = QLineEdit()
        layout.addWidget(QLabel("Enter text:"))
        layout.addWidget(self.text_input)

        # Create and add submit button
        self.submit_button = QPushButton("Submit")
        self.submit_button.clicked.connect(self.on_submit)
        layout.addWidget(self.submit_button)

        self.centralWidget.setLayout(layout)

This code creates the following widgets:

  • Drop-down Menu: Lists the vision tasks supported by Florence-2.
  • Text input Widget: Allows text input for tasks such as Open Vocabulary Detection.
  • Submit Button: Triggers the on_submit() method when clicked.

When the application is running, the user selects a vision task, enters text (if needed), and clicks "Submit" to change the task performed by the model. The on_submit() method is then invoked, calling the set_parameters() method in the FlorenceApp class to update the operator's parameters.

    def on_submit(self):
        """Handle the submit button click event."""
        selected_option = self.dropdown.currentText()
        entered_text = self.text_input.text()

        # Set parameters in the Holoscan application
        global gApp
        if gApp:
            gApp.set_parameters(selected_option, entered_text)

Starting the Holoscan Application Thread

The runHoloscanApp() method starts the Florence-2 application by creating an instance of FlorenceWorker and running it in a thread.

    def runHoloscanApp(self):
        """Run the Holoscan application in a separate thread."""
        self.thread = QThread()
        self.worker = FlorenceWorker()
        self.worker.moveToThread(self.thread)
        self.thread.started.connect(self.worker.run)
        self.worker.finished.connect(self.thread.quit)
        self.worker.finished.connect(self.worker.deleteLater)
        self.thread.finished.connect(self.thread.deleteLater)
        self.thread.start()

When the thread is started, it calls the FlorenceWorker class's run() method which creates and runs the Holoscan application.

# Worker class to run the Holoscan application in a separate thread
class FlorenceWorker(QObject):
    finished = Signal()  # Signal to indicate the worker has finished
    progress = Signal(int)  # Signal to indicate progress (if needed)

    def run(self):
        """Run the Holoscan application."""
        config_file = os.path.join(os.path.dirname(__file__), "config.yaml")
        global gApp
        gApp = app = FlorenceApp()
        app.config(config_file)
        app.run()

This covers the essential steps for creating a GUI to control your Python Holoscan applications. To try out the application, follow the instructions provided here.

Adding a GUI to Your Own Application

To integrate a GUI into your Python application using PySide6, follow these steps:

  1. Ensure Qt and PySide6 dependencies are included in your Dockerfile. Verify that Qt and PySide6 package licenses meet your project requirements.
  2. Copy the qt_app.py file to your application directory. Rename and modify the FlorenceWorker class to create an instance of your application. Update the import statement from florence2_app import FlorenceApp as necessary.
  3. Customize the setupUi() method to include the controls relevant to your application.
  4. Update set_parameters() methoed to reflect the parameters your application needs to update.