Machines are high-performance computing resources for scaling AI applications.
NVLink is required to accelerate your training workloads on A100-80Gx8 machines running Ubuntu 22.04. On any other machine running Ubuntu 22.04, such as H100x1 and A100-80Gx1, you need to disable NVLink to ensure that CUDA runs.
To disable NVLink, first disable it at the system level and then in the virtual machine’s (VM) RAM disk.
Open your GRUB configuration file for editing using the following command:
sudo nano /etc/default/grub
Update GRUB_CMDLINE_LINUX_DEFAULT to the following:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvlink.disable=1"
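For scripted setups, the same edit could be made without opening an editor. The following is a minimal sketch, not an official procedure: append_grub_param is a hypothetical helper name, and the sketch assumes the GRUB_CMDLINE_LINUX_DEFAULT value is wrapped in double quotes and that the parameter contains no slash characters. Try it on a copy of the file first.

```shell
#!/bin/sh
# Sketch: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a
# GRUB defaults file. append_grub_param is a hypothetical helper name.
append_grub_param() {
    file=$1     # path to the GRUB defaults file (or a copy of it)
    param=$2    # parameter to append, for example nvlink.disable=1

    # Stop if the file has no double-quoted GRUB_CMDLINE_LINUX_DEFAULT line.
    if ! grep -q '^GRUB_CMDLINE_LINUX_DEFAULT="' "$file"; then
        echo "no GRUB_CMDLINE_LINUX_DEFAULT line in $file" >&2
        return 1
    fi

    # Do nothing if the parameter is already present on that line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi

    # Insert the parameter just before the closing double quote, writing
    # through a temporary file so the original keeps its permissions.
    new=$(mktemp) || return 1
    sed "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file" > "$new" \
        && cat "$new" > "$file"
    status=$?
    rm -f "$new"
    return "$status"
}

# Example against a copy (the copy path is a placeholder):
# cp /etc/default/grub /tmp/grub.copy
# append_grub_param /tmp/grub.copy "nvlink.disable=1"
```

Writing to /etc/default/grub itself requires root privileges. Because the helper skips the edit when the parameter is already present, it can be run more than once without duplicating it.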
Complete the update with the following command:
sudo update-grub
You must reboot your system for the changes to take effect.
sudo reboot
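After the reboot, you may want to confirm that the kernel actually booted with the new parameter. The following is a minimal sketch under the assumption that /proc/cmdline reflects the boot parameters, as it does on standard Linux systems. has_kernel_param is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report whether a parameter appears as a whole word in a kernel
# command line. has_kernel_param is a hypothetical helper name.
has_kernel_param() {
    # $1: parameter to look for, $2: kernel command-line string
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example after reboot:
# if has_kernel_param "nvlink.disable=1" "$(cat /proc/cmdline)"; then
#     echo "nvlink.disable=1 is active"
# else
#     echo "nvlink.disable=1 is missing from /proc/cmdline"
# fi
```

If the parameter is missing, check that the GRUB configuration was saved and that sudo update-grub completed without errors before rebooting again.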
You need to update your RAM disk to ensure NVLink is disabled at startup.
To disable NVLink, modify the initramfs configuration using one of the following options:
Create a denylist for NVIDIA NVLink Modules
Create a file in /etc/modprobe.d/ that denylists the NVIDIA NVLink modules. This does not change initramfs itself but makes initramfs respect your denylist.
echo "blacklist <module_name>" | sudo tee /etc/modprobe.d/nvlink-denylist.conf
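If you need to denylist more than one module, the file can hold one blacklist line per module. The following is a minimal sketch: make_denylist is a hypothetical helper name, and module_a and module_b are placeholders for whichever module names apply to your system.

```shell
#!/bin/sh
# Sketch: print one "blacklist" line for each module name passed as an
# argument. make_denylist is a hypothetical helper name.
make_denylist() {
    for module in "$@"; do
        printf 'blacklist %s\n' "$module"
    done
}

# Example (module_a and module_b are placeholders, not real module names):
# make_denylist module_a module_b | sudo tee /etc/modprobe.d/nvlink-denylist.conf
```

Printing to standard output keeps the helper free of side effects, so you can review the generated lines before piping them to sudo tee.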
Create a custom script in /etc/initramfs-tools/ to disable NVLink
Create a custom shell script, for example /new-init/disable_nvlink.sh:
sudo nano /etc/initramfs-tools/scripts/new-init/disable_nvlink.sh
Add commands that disable NVLink for your RAM disk. Here is an example disable_nvlink.sh script that unloads specific NVIDIA kernel modules:
#!/bin/sh
# Unload the NVLink kernel module.
modprobe -r nvidia_nvlink
# Unload the NVIDIA unified memory kernel module.
modprobe -r nvidia_uvm
Ensure that the script is executable by running the following command:
sudo chmod +x /etc/initramfs-tools/scripts/new-init/disable_nvlink.sh
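Before rebuilding the RAM disk, it can help to confirm that the script exists, is executable, and parses cleanly. The following is a minimal sketch: check_initramfs_script is a hypothetical helper name, and sh -n only checks syntax without running any of the script's commands.

```shell
#!/bin/sh
# Sketch: sanity-check a shell script before it is packed into the RAM disk.
# check_initramfs_script is a hypothetical helper name.
check_initramfs_script() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 1
    fi
    # sh -n parses the script without executing it.
    if ! sh -n "$script"; then
        echo "syntax error: $script"
        return 1
    fi
    echo "ok: $script"
}

# Example:
# check_initramfs_script /etc/initramfs-tools/scripts/new-init/disable_nvlink.sh
```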
Rebuild initrd for your currently running kernel using the following command:
sudo update-initramfs -u
To ensure NVLink is disabled for every Linux kernel version installed on your system, you may want to update all installed kernels. This is useful in multi-boot environments, on systems that keep multiple kernel versions for testing or compatibility reasons, or when you want every available boot option to behave consistently. To update all installed kernels, use the following command:
sudo update-initramfs -c -k all
Lastly, reboot your system.
sudo reboot
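Once the system is back up, you may want to check whether the modules named in the example script are still loaded. The following is a minimal sketch that reads lsmod output from standard input. module_loaded is a hypothetical helper name, and the module names are the ones used in the example script above.

```shell
#!/bin/sh
# Sketch: succeed if the named module appears in lsmod-style output read
# from standard input. module_loaded is a hypothetical helper name.
module_loaded() {
    # $1: module name; the first input line is the lsmod header and is skipped
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Example:
# for m in nvidia_nvlink nvidia_uvm; do
#     if lsmod | module_loaded "$m"; then
#         echo "$m is still loaded"
#     else
#         echo "$m is not loaded"
#     fi
# done
```

The helper compares whole module names, so a check for nvidia does not match nvidia_uvm.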