Docker + NVIDIA Setup for Hosting
Vast.ai and RunPod both deliver work to your machine as Docker containers with GPU access mapped through. That means three pieces have to be in place before anything else: an NVIDIA driver that recognizes your GPU, a working Docker installation, and the NVIDIA Container Toolkit that lets Docker see the GPU. Get one of them wrong and your host agent will either refuse to come online or happily come online and fail every rental.
This guide covers Ubuntu 22.04 and 24.04, which are the supported targets for most hosting platforms. Commands are copy-paste safe for a fresh install. If you are layering this onto an existing system, adapt accordingly.
Prerequisites
- A machine with a supported NVIDIA GPU (Pascal / GTX 10-series or newer as a practical floor)
- Ubuntu 22.04 LTS or 24.04 LTS, installed and updated
- A user with sudo privileges
- An internet connection for package downloads
Before you begin, update the system:
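On a stock Ubuntu install, the standard sequence is:

```shell
# Refresh package lists and apply all pending updates
sudo apt update && sudo apt upgrade -y

# Reboot so you are running the newest installed kernel
sudo reboot
```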
A reboot clears any pending kernel updates before you install the driver, which avoids a category of "driver built against the wrong kernel" errors later.
Step 1: install the NVIDIA driver
The simplest path on Ubuntu is ubuntu-drivers, which picks a recommended driver version for your detected GPU:
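On Ubuntu Desktop, ubuntu-drivers ships by default; on minimal Server images you may need to install it first (the first line below covers that case):

```shell
# Provides the ubuntu-drivers command (already present on Desktop installs)
sudo apt install -y ubuntu-drivers-common

# Detect the GPU and install the recommended driver package
sudo ubuntu-drivers autoinstall

# Reboot to load the new kernel module
sudo reboot
```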
After reboot, verify the driver is running and the GPU is visible:
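```shell
nvidia-smi
```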
You should see a table listing your GPU, driver version, CUDA version, and current utilization. If nvidia-smi prints "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver", the driver did not install cleanly — see the troubleshooting section below.
Alternative: install a specific driver version
If ubuntu-drivers autoinstall picks a version you don't want, you can install a specific one. List what's available:
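```shell
# Shows detected GPUs and available driver packages;
# the suggested default is marked "recommended"
ubuntu-drivers devices
```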
Then install by name, for example:
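The version number below is illustrative; substitute whichever package `ubuntu-drivers devices` listed and your platform requires:

```shell
# "535" is an example version, not a recommendation
sudo apt install -y nvidia-driver-535
sudo reboot
```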
Match the driver version to what your hosting platform requires — Vast.ai publishes a minimum driver version in their host docs.
Step 2: install Docker CE
Use Docker's official apt repository rather than Ubuntu's older docker.io package — the upstream package is kept current and works cleanly with the NVIDIA Container Toolkit.
Remove any older Docker packages that might interfere:
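This mirrors the cleanup step from Docker's own install docs; it is harmless if none of these packages are present (apt simply reports there is nothing to remove):

```shell
# Remove distro-packaged Docker variants that conflict with Docker CE
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do
  sudo apt-get remove -y $pkg
done
```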
Install prerequisites, add Docker's GPG key, and add the repo:
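This follows Docker's official apt-repository setup for Ubuntu:

```shell
# Prerequisites
sudo apt-get update
sudo apt-get install -y ca-certificates curl

# Docker's GPG key
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository for your architecture and Ubuntu release
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker CE and its companion packages
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```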
Verify Docker is running:
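```shell
# Confirm the daemon is active
sudo systemctl status docker --no-pager

# Run the canonical smoke-test container
sudo docker run --rm hello-world
```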
You should see the "Hello from Docker!" output. Optionally, add your user to the docker group to skip sudo on subsequent commands (log out and back in for group membership to take effect):
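```shell
sudo usermod -aG docker $USER
```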
Step 3: install the NVIDIA Container Toolkit
This is the bridge that lets Docker containers see and use your GPU.
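Install it from NVIDIA's apt repository, per the toolkit's official install guide:

```shell
# Add NVIDIA's GPG key and repository for the Container Toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
```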
Configure Docker to use the NVIDIA runtime, then restart Docker:
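```shell
# Writes the NVIDIA runtime entry into /etc/docker/daemon.json
sudo nvidia-ctk runtime configure --runtime=docker

# Restart the daemon so the new runtime is picked up
sudo systemctl restart docker
```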
Step 4: verify the full stack
This is the single most important test. If this works, your host is ready for a hosting agent to install on top:
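```shell
# Run nvidia-smi inside a CUDA base image with the GPU passed through
sudo docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
```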
You should see the same nvidia-smi output you got from the host, but this time running inside a container. If the container can see the GPU, Docker can hand it to Vast.ai or RunPod workloads.
The nvidia/cuda:12.0.0-base-ubuntu22.04 image is a widely-available reference image. Newer CUDA versions work the same way: swap in whatever tag is current.
Common errors and fixes
"Failed to initialize NVML: Driver/library version mismatch"
This usually means you updated the driver package but didn't reboot, or you have mixed driver versions installed. Fix: sudo reboot. If the error persists after reboot, purge NVIDIA packages and reinstall:
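A sketch of the purge-and-reinstall path; the wildcard removes all NVIDIA driver packages, so expect to lose display output until the reinstall completes:

```shell
# Remove every installed NVIDIA driver package
sudo apt purge -y 'nvidia-*' 'libnvidia-*'
sudo apt autoremove -y

# Reinstall the recommended driver and reboot
sudo ubuntu-drivers autoinstall
sudo reboot
```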
"could not select device driver with capabilities: [[gpu]]"
Docker can't find the NVIDIA runtime. Either the Container Toolkit isn't installed, or nvidia-ctk runtime configure wasn't run. Re-run Step 3.
"docker: Error response from daemon: unknown or invalid runtime name: nvidia"
The /etc/docker/daemon.json file either doesn't exist or is missing the NVIDIA runtime entry. Running sudo nvidia-ctk runtime configure --runtime=docker writes a correct config. After that, restart Docker.
"cannot open shared object file: libcuda.so"
The container image is looking for CUDA libraries that the NVIDIA driver on the host provides. Almost always caused by a missing or outdated driver on the host. Re-run nvidia-smi on the host — if that fails too, the driver is the problem.
cgroup v2 quirks on Ubuntu 24.04
Ubuntu 24.04 uses cgroup v2 by default. This is generally fine with modern Docker and NVIDIA Container Toolkit versions, but if you see permission errors when containers try to access the GPU, make sure you are on a recent version of the toolkit (apt list --installed | grep nvidia-container-toolkit). Older versions pre-date cgroup v2 support.
Where to go from here
With the Docker + NVIDIA stack verified, you are ready to install the Vast.ai or RunPod host agent following their respective onboarding docs. Before you do, it's worth running your rig through our compatibility checker — it covers the non-software prerequisites (VRAM threshold, internet speed, storage capacity) that matter just as much.
Verify your rig meets the minimums
Software is one piece. The RigHost checker covers the rest — GPU class, VRAM, bandwidth, storage, OS. Run it in a browser or pipe the CLI version into your server.
Run the Compatibility Checker →