Why Mount Custom Profiler Versions?
The Problem
Container images (like NVIDIA NeMo, PyTorch NGC, etc.) ship with pre-installed versions of Nsight Systems (nsys) and CUPTI libraries. However, these bundled versions may contain bugs or lack features needed for your specific profiling requirements.
✗ Container's Built-in Version
- May contain known bugs
- Cannot be updated without rebuilding the image
- Version locked to the container release
- Missing the latest profiling features
✓ Mounted Custom Version
- Use any version you need
- Quickly swap versions without rebuilding
- Apply bug fixes immediately
- Test new profiler features
Architecture Overview
Host System: /opt/tools/nsys/2025.5.1/ and /opt/tools/cupti/13.0.85/
Container: /usr/local/.../nsys ← host version active
The mount overlays the container's built-in version with your custom version from the host.
Installing Nsight Systems
Download the Installer
Download from NVIDIA Developer portal
Run the Installer
Install to a custom directory (non-interactive mode)
Verify Installation
Check that nsys binary exists and works
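The three steps above might look like the following on an x86_64 host (the version number and install prefix are examples; the download URL pattern is from the Quick Reference below, and non-interactive installer flags vary between releases, so check `--help` first):

```shell
# Example version and install prefix -- adjust to your environment
NSYS_VERSION=2025.5.1
INSTALL_DIR=/opt/tools/nsys/${NSYS_VERSION}

# 1. Download the x86_64 installer (see Quick Reference for the ARM64 name)
wget "https://developer.download.nvidia.com/devtools/nsight-systems/NsightSystems-linux-public-${NSYS_VERSION}.run"

# 2. The .run file is a self-extracting installer; inspect the flags your
#    release supports before scripting a non-interactive install into
#    ${INSTALL_DIR}
sh "NsightSystems-linux-public-${NSYS_VERSION}.run" --help

# 3. Verify: locate the installed nsys binary and print its version
find "${INSTALL_DIR}" -name nsys -type f
"${INSTALL_DIR}/bin/nsys" --version   # path may differ; use the find result
```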
Installation Directory Structure
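An illustrative layout after installing to a custom prefix (exact subdirectory names vary by version and architecture; this is a sketch, not an authoritative listing):

```
/opt/tools/nsys/2025.5.1/
├── bin/
│   └── nsys              # CLI entry point
├── target-linux-x64/     # target-side collector and libraries
└── host-linux-x64/       # host-side processing libraries
```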
Installing CUPTI Library
When to Install CUPTI Separately?
CUPTI (CUDA Profiling Tools Interface) is included with Nsight Systems. Install it separately only when you need a specific CUPTI version that differs from what's in your nsys installation or container.
Download CUPTI Archive
From NVIDIA CUDA redistributables
Extract to Installation Directory
Extract and strip top-level directory
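Using the redistributable URL pattern from the Quick Reference below, the two steps might look like this (version and install prefix are examples):

```shell
# Example version (matches the host path shown in the architecture
# overview) and install prefix -- adjust to your environment
CUPTI_VERSION=13.0.85
CUPTI_DIR=/opt/tools/cupti/${CUPTI_VERSION}

# Download the x86_64 archive (see Quick Reference for the ARM64 name)
wget "https://developer.download.nvidia.com/compute/cuda/redist/cuda_cupti/linux-x86_64/cuda_cupti-linux-x86_64-${CUPTI_VERSION}-archive.tar.xz"

# Extract, stripping the top-level cuda_cupti-...-archive/ directory so
# the archive contents land directly under ${CUPTI_DIR}
mkdir -p "${CUPTI_DIR}"
tar -xJf "cuda_cupti-linux-x86_64-${CUPTI_VERSION}-archive.tar.xz" \
    -C "${CUPTI_DIR}" --strip-components=1
```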
Container Mounting Strategy
Key Concept: Mount Over Container Path
To replace the container's built-in nsys, you mount your host directory over the exact path where nsys is installed inside the container. This "shadows" the original installation.
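With Docker, for example, a single read-only bind mount is enough to shadow the built-in installation (paths and image tag below are illustrative; substitute your image's actual nsys path):

```shell
# Mount the host nsys directory over the container's install path,
# then ask the container which version it now sees
docker run --rm \
  -v /opt/tools/nsys/2025.5.1:/usr/local/cuda-12.9/NsightSystems-cli-2025.1.1:ro \
  nvcr.io/nvidia/nemo:25.07.01 \
  nsys --version   # should report the mounted host version
```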
Finding the Container's Nsys Path
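One way to discover the path is to ask an unmodified container where its nsys lives (docker shown; the same commands work under podman or an enroot shell):

```shell
# Locate the nsys entry point inside the stock container
docker run --rm nvcr.io/nvidia/nemo:25.07.01 which nsys

# nsys is often a symlink into the versioned install directory;
# resolve it to find the directory you need to mount over
docker run --rm nvcr.io/nvidia/nemo:25.07.01 \
  sh -c 'readlink -f "$(which nsys)"'
```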
Common Container Nsys Paths
| Container Image | Nsys Install Path |
|---|---|
| `nvcr.io/nvidia/nemo:25.07.01` | `/usr/local/cuda-12.9/NsightSystems-cli-2025.1.1` |
| `nvcr.io/nvidia/nemo:25.09.00` | `/usr/local/cuda-12.9/NsightSystems-cli-2025.4.1` |
| `nvcr.io/nvidia/pytorch:xx.xx` | Check with `which nsys` in the container |
Enroot/Pyxis Configuration (SLURM)
Enroot is commonly used on SLURM clusters together with the Pyxis plugin for containerized HPC workloads.
Environment Variable Approach
Many job launchers support passing mounts via environment variables:
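A minimal sbatch sketch assuming Pyxis's `--container-image` and `--container-mounts` flags, with the mount held in a shell variable so it can be toggled per job (image tag, paths, and `train.py` are hypothetical examples):

```shell
#!/bin/bash
#SBATCH --nodes=1

# Toggle the custom profiler by setting or clearing this variable
NSYS_MOUNT="/opt/tools/nsys/2025.5.1:/usr/local/cuda-12.9/NsightSystems-cli-2025.1.1"

srun --container-image=nvcr.io/nvidia/nemo:25.07.01 \
     --container-mounts="${NSYS_MOUNT}" \
     nsys profile -o /results/report python train.py
```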
Docker/Podman Configuration
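With plain `docker run` (podman accepts the same flags), the mount is a read-only bind over the container's install path; the image tag, paths, and `train.py` below are illustrative:

```shell
docker run --rm --gpus all \
  -v /opt/tools/nsys/2025.5.1:/usr/local/cuda-12.9/NsightSystems-cli-2025.1.1:ro \
  nvcr.io/nvidia/nemo:25.07.01 \
  nsys profile -o /tmp/report python train.py
```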
Docker Compose Example
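A hedged Compose sketch (service name, image tag, container path, and command are examples; adjust to your image):

```yaml
services:
  trainer:
    image: nvcr.io/nvidia/nemo:25.07.01
    cap_add:
      - SYS_ADMIN          # often required for GPU profiling
    volumes:
      # Shadow the container's built-in nsys with the host version
      - /opt/tools/nsys/2025.5.1:/usr/local/cuda-12.9/NsightSystems-cli-2025.1.1:ro
    command: nsys profile -o /results/report python train.py
```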
Verification
Version Mismatch Check
If `nsys --version` shows the container's original version, the mount didn't work. Common causes:
- Incorrect container path (check `which nsys` in an unmodified container)
- Mount syntax error
- Host path doesn't exist or isn't accessible
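A quick check that the mount took effect: compare the version reported inside the container against the host copy (docker shown; paths and image tag are examples):

```shell
HOST_PATH=/opt/tools/nsys/2025.5.1
CONTAINER_PATH=/usr/local/cuda-12.9/NsightSystems-cli-2025.1.1

# Version reported by the host installation
"${HOST_PATH}/bin/nsys" --version

# Version reported inside the mounted container -- these should match
docker run --rm -v "${HOST_PATH}:${CONTAINER_PATH}:ro" \
  nvcr.io/nvidia/nemo:25.07.01 nsys --version
```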
Troubleshooting
Error: "nsys: error while loading shared libraries"
The mounted nsys can't find required libraries.
Solution: Mount the entire nsys installation directory, not just the binary. The directory contains required libraries and dependencies.
Error: "CUPTI_ERROR_INSUFFICIENT_PRIVILEGES"
Profiling requires elevated privileges.
Solution: Run with --privileged or set --cap-add=SYS_ADMIN in Docker. For SLURM, ensure node configuration allows profiling.
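For Docker, the capability can be granted per run; a sketch with a hypothetical workload (`--privileged` is the broader hammer if `SYS_ADMIN` alone is not enough):

```shell
# Minimal: grant only the capability profiling needs
docker run --rm --gpus all --cap-add=SYS_ADMIN \
  nvcr.io/nvidia/nemo:25.07.01 nsys profile python train.py

# Broad fallback: full privileges (use sparingly)
docker run --rm --gpus all --privileged \
  nvcr.io/nvidia/nemo:25.07.01 nsys profile python train.py
```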
Error: Architecture mismatch
x86_64 nsys on ARM64 container (or vice versa).
Solution: Download the correct architecture version. Use sbsa for ARM64, linux-public for x86_64.
Tip: Preserve Original as Fallback
Keep your mount configuration modular so you can quickly disable it if issues arise. Use environment variables or config files to toggle mounts.
Quick Reference
Nsys Download URLs
- Base URL: `developer.download.nvidia.com/devtools/nsight-systems/`
- x86_64: `NsightSystems-linux-public-{version}.run`
- ARM64: `NsightSystems-linux-sbsa-public-{version}.run`
CUPTI Download URLs
- Base URL: `developer.download.nvidia.com/compute/cuda/redist/cuda_cupti/`
- x86_64: `linux-x86_64/cuda_cupti-linux-x86_64-{version}-archive.tar.xz`
- ARM64: `linux-sbsa/cuda_cupti-linux-sbsa-{version}-archive.tar.xz`