Configuring GPU Access for Docker on a Remote Server

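When renting compute from a cloud GPU platform, you may need to use Docker. I recently ran into the following error, so this post records how I fixed it:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

The error means Docker could not find a driver that provides GPU support, which typically happens when the NVIDIA container runtime is not installed on the host. The official installation guide is here: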

https://link.zhihu.com/?target=https%3A//docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

Run the following commands (you can also put them in a script):

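# Add the NVIDIA repository GPG key (curl itself does not need root)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
sudo apt-key add -
# Build the distribution string from /etc/os-release, e.g. ubuntu20.04
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
# Register the package repository for this distribution
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update

# Install the runtime, then restart Docker so it picks up the change
sudo apt-get install nvidia-container-runtime
sudo systemctl restart docker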
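
The repository URL in the commands above depends on the distribution string, so it can help to check what that string evaluates to before adding the repository. The following is a minimal sketch; detect_distribution is a hypothetical helper name introduced here for illustration, and the optional file argument exists only so the function can be tried against a sample file.

```shell
#!/bin/sh
# Sketch only: detect_distribution is a hypothetical helper, not part of any NVIDIA tool.
# It prints the same ID + VERSION_ID string that the install commands use.
detect_distribution() {
    # Optional first argument: path to an os-release file (defaults to /etc/os-release).
    os_release_file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak into the caller.
    (
        . "$os_release_file"
        printf '%s%s\n' "$ID" "$VERSION_ID"
    )
}
```

On Ubuntu 20.04, for example, calling detect_distribution with no arguments prints ubuntu20.04. Whether a repository list exists for a given string is up to NVIDIA's repository, so an unexpected value here is a likely reason for the apt-get update step to fail.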

To verify the installation, run the following command:

which nvidia-container-runtime

If the output is /usr/bin/nvidia-container-runtime, the installation succeeded.
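
Finding the binary only shows that the package is installed; it does not show that a container can actually see the GPU. A possible end-to-end check is sketched below. The helper names has_command and verify_gpu_setup are hypothetical, and the sketch assumes the host NVIDIA driver is already installed and that the public ubuntu image can be pulled.

```shell
#!/bin/sh
# Sketch only: has_command and verify_gpu_setup are hypothetical helper names.

# Succeeds if the named program can be found on PATH.
has_command() {
    command -v "$1" >/dev/null 2>&1
}

# Checks that the runtime binary exists, then runs nvidia-smi inside a container.
verify_gpu_setup() {
    if ! has_command nvidia-container-runtime; then
        echo "nvidia-container-runtime not found on PATH" >&2
        return 1
    fi
    # Assumption: the host driver is installed and the "ubuntu" image is available.
    # --gpus all requests every GPU; prefix with sudo if your user is not in the docker group.
    docker run --rm --gpus all ubuntu nvidia-smi
}
```

Calling verify_gpu_setup should print the usual nvidia-smi table if everything is wired up. If the original "could not select device driver" error still appears, the runtime was probably not picked up, and restarting Docker again is the first thing to try.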

References

Docker容器中使用 Nvidia GPU (Using Nvidia GPUs in Docker Containers) - Zhihu (zhihu.com)