# Chat with NVIDIA NIM
Authors: Holoscan Team (NVIDIA)
Supported platforms: x86_64, aarch64
Last modified: March 18, 2025
Language: Python
Latest version: 1.0
Minimum Holoscan SDK version: 1.0.3
Tested Holoscan SDK versions: 1.0.3, 2.1.0
Contribution metric: Level 1 - Highly Reliable
This sample application shows how to use the OpenAI SDK with NVIDIA Inference Microservice (NIM). It works whether you are using a NIM from build.nvidia.com/ or a self-hosted NIM.
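Because NIM exposes an OpenAI-compatible API, the heart of this pattern is the standard `openai` client pointed at a NIM endpoint. Below is a minimal sketch of that pattern, not the application's actual code; the model ID and the `API_KEY` environment variable are illustrative assumptions.

```python
# Minimal sketch: chat completion against a NIM via the OpenAI SDK.
# Assumes `pip install openai` and an API key exported as API_KEY.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # or your self-hosted NIM URL
    api_key=os.environ.get("API_KEY"),
)

# The model ID below is illustrative; use one configured in nvidia_nim.yaml.
response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",
    messages=[{"role": "user", "content": "Hello, NIM!"}],
)
print(response.choices[0].message.content)
```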
## Quick Start
- Add your API key in `nvidia_nim.yaml`
- Run `./dev_container build_and_run nvidia_nim_chat`
## Configuring the sample application
Use the `nvidia_nim.yaml` configuration file to configure the sample application:
### Connection Information
```yaml
nim:
  base_url: https://integrate.api.nvidia.com/v1
  api_key:
```
- `base_url`: The URL of your NIM instance. Defaults to NVIDIA hosted NIMs.
- `api_key`: Your API key to access NVIDIA hosted NIMs.
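These two values map directly onto the OpenAI client's constructor. The sketch below shows that mapping for orientation; it assumes PyYAML for loading and is an illustration, not the application's actual implementation.

```python
# Illustrative sketch: reading the `nim` section of nvidia_nim.yaml
# and handing its values to the OpenAI client. Assumes PyYAML.
import yaml
from openai import OpenAI

with open("nvidia_nim.yaml") as f:
    config = yaml.safe_load(f)

client = OpenAI(
    base_url=config["nim"]["base_url"],
    api_key=config["nim"]["api_key"],
)
```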
### Model Information
The `models` section in the YAML file is configured with multiple NVIDIA hosted models by default, so you can switch between models within the application by sending the `/m` prompt. Model parameters may also be added or adjusted per model in the `models` section, as in the example below.
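For illustration, a `models` section with two hosted models and per-model parameters might look like the following; the aliases, model IDs, and parameter values here are examples, not the file's actual defaults.

```yaml
models:
  # Each key is an alias you can switch to with the /m prompt.
  llama3-8b-instruct:
    model: meta/llama3-8b-instruct
    max_tokens: 1024
  mixtral-8x7b-instruct:
    model: mistralai/mixtral-8x7b-instruct-v0.1
    temperature: 0.5
```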
## Run the sample application
There are two options for running the sample application:
### Run using Docker
To run the sample application with Docker, you must first build a Docker image that includes the sample application and its dependencies:
```bash
# Build the Docker image from the root directory of HoloHub
./dev_container build --docker_file applications/nvidia_nim/Dockerfile
```
Then, run the Docker image:
```bash
./dev_container launch
```
Continue to the Start the Application section once inside the Docker container.
### Run the Application without Docker
Install all dependencies from the `requirements.txt` file:

```bash
# Optionally create a virtual environment and activate it
python3 -m venv .venv
source .venv/bin/activate

# Install the required packages
pip install -r applications/nvidia_nim/chat/requirements.txt
```
### Start the Application
To use the NIMs on build.nvidia.com/, configure your API key in the `nvidia_nim.yaml` configuration file and run the sample app as follows:

Note: you may also configure your API key using an environment variable, e.g., `export API_KEY=...`.

```bash
# To use NVIDIA hosted NIMs available on build.nvidia.com, export your API key first
export API_KEY=[enter your api key here]

./run launch nvidia_nim_chat
```
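One plausible way to honor both sources of the API key is sketched below; the precedence shown (environment variable over YAML value) is an assumption for illustration, not necessarily the application's exact behavior.

```python
# Illustrative sketch: resolve the API key, preferring the environment
# variable over nvidia_nim.yaml. The precedence here is an assumption.
import os

import yaml

with open("nvidia_nim.yaml") as f:
    nim_config = yaml.safe_load(f)["nim"]

api_key = os.environ.get("API_KEY") or nim_config.get("api_key")
if not api_key:
    raise SystemExit("Set API_KEY or fill in api_key in nvidia_nim.yaml")
```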
Have fun!
## Connecting with Locally Hosted NIMs
To use a locally hosted NIM, first download and start the NIM.
Then configure the `base_url` parameter in the `nvidia_nim.yaml` configuration file to point to your local NIM instance.

The following example shows a NIM running locally, serving its APIs and the `meta-llama3-8b-instruct` model from `http://0.0.0.0:8000/v1`:
```yaml
nim:
  base_url: http://0.0.0.0:8000/v1/

models:
  llama3-8b-instruct:
    model: meta-llama3-8b-instruct  # name of the model served by the NIM
    # Add, update, or remove the following key/value pairs to configure the model parameters
    top_p: 1
    n: 1
    max_tokens: 1024
    frequency_penalty: 1.0
```
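Before launching the chat application, it can help to confirm the local endpoint is reachable. The sketch below lists the models the NIM serves through its OpenAI-compatible API; the placeholder API key reflects the assumption that a locally hosted NIM does not validate it.

```python
# Illustrative sanity check: list models served by a local NIM through
# its OpenAI-compatible API. The placeholder key is an assumption; a
# locally hosted NIM typically does not require a real API key.
from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="not-used")
for model in client.models.list():
    print(model.id)
```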