Restoring Compatibility between Docker and GPUs

Overview

If you run Docker with GPU access on the CircleCI GPU executor, you may see the following error:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0407] error waiting for container: context canceled

This happens because the nvidia-container-toolkit was removed when CircleCI switched to images that make multiple CUDA versions available at runtime. Without the toolkit, Docker has no GPU device driver to select.

Solution

Step 1: Add a step in your config.yml to install `nvidia-container-toolkit` and Restart Docker

      - run:
          name: Install nvidia-container-toolkit and Restart Docker
          command: |
            sudo apt-get update
            sudo apt-get install -y nvidia-container-toolkit
            sudo systemctl restart docker
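
For context, here is a minimal sketch of where this step could sit in a complete config.yml. The job name `gpu-docker`, the workflow name `main`, the machine image `linux-cuda-12:default`, and the resource class `gpu.nvidia.medium` are assumptions for illustration and do not come from this article. Substitute the image and resource class your project already uses.

```yaml
version: 2.1

jobs:
  gpu-docker:                            # hypothetical job name
    machine:
      image: linux-cuda-12:default       # assumed GPU image; use your own
    resource_class: gpu.nvidia.medium    # assumed GPU resource class
    steps:
      - checkout
      # The install step from Step 1, unchanged
      - run:
          name: Install nvidia-container-toolkit and Restart Docker
          command: |
            sudo apt-get update
            sudo apt-get install -y nvidia-container-toolkit
            sudo systemctl restart docker

workflows:
  main:                                  # hypothetical workflow name
    jobs:
      - gpu-docker
```

Placing the install step before any step that calls `docker run --gpus` should ensure Docker has restarted with the toolkit available by the time a GPU container starts.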

Step 2: Verify Compatibility

To confirm that Docker can access the GPU, run a test container from the nvidia/cuda:11.4.3-base-ubuntu20.04 image. The container runs the nvidia-smi command, which prints GPU information.

      - run:
          name: Test GPU Docker
          command: docker run --gpus all nvidia/cuda:11.4.3-base-ubuntu20.04 nvidia-smi

If the nvidia-container-toolkit is functioning correctly and Docker can utilize the GPU resources, you should see the GPU information displayed.
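
If the test container still fails, a follow-up step along these lines may help narrow down the cause. This is a sketch rather than an official diagnostic: the step name is hypothetical, and it assumes a Debian/Ubuntu-based machine image where `dpkg` is available. `when: on_fail` makes the step run only when an earlier step in the job has failed.

```yaml
      - run:
          name: Diagnose GPU Docker setup    # hypothetical step name
          when: on_fail                      # run only if an earlier step failed
          command: |
            # Can the host itself see the GPU, independent of Docker?
            nvidia-smi || echo "Host cannot see the GPU"
            # Is the toolkit package actually installed?
            dpkg -s nvidia-container-toolkit || echo "nvidia-container-toolkit is not installed"
            # Show what Docker reports about its runtimes
            docker info | grep -i runtime || true
```

If `nvidia-smi` works on the host but the package check fails, the install step most likely did not complete. If both succeed, the Docker restart is the next thing to check.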
