Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Use GPU on python docker image

I'm using the python:3.7.4-slim-buster Docker image and I can't change it. I'm wondering how to use my NVIDIA GPUs in it.

I usually use tensorflow/tensorflow:1.14.0-gpu-py3, where a simple --runtime=nvidia in the docker run command made everything work fine, but now I have this constraint.

I don't think any shortcut exists for this type of image, so I was following this guide, https://towardsdatascience.com/how-to-properly-use-the-gpu-within-a-docker-container-4c699c78c6d1, and building the Dockerfile it proposes:

FROM python:3.7.4-slim-buster

RUN apt-get update && apt-get install -y build-essential
RUN apt-get --purge remove -y nvidia*
# Get the install files you used to install CUDA and the NVIDIA drivers on your host.
ADD ./Downloads/nvidia_installers /tmp/nvidia
# Install the driver.
RUN /tmp/nvidia/NVIDIA-Linux-x86_64-331.62.run -s -N --no-kernel-module
# For some reason the driver installer leaves temp files behind when run during a docker build
# (I don't have any explanation why), and the CUDA installer fails if they are still there, so delete them.
RUN rm -rf /tmp/selfgz7
# CUDA driver installer.
RUN /tmp/nvidia/cuda-linux64-rel-6.0.37-18176142.run -noprompt
# CUDA samples; comment this out if you don't want them.
RUN /tmp/nvidia/cuda-samples-linux-6.0.37-18176142.run -noprompt -cudaprefix=/usr/local/cuda-6.0
# Add the CUDA libraries to your library path.
RUN export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
# Update the ld.so.conf.d directory.
RUN touch /etc/ld.so.conf.d/cuda.conf
# Delete installer files.
RUN rm -rf /temp/*

But it raises an error:

ADD failed: stat /var/lib/docker/tmp/docker-builder080208872/Downloads/nvidia_installers: no such file or directory

What can I change to easily let the Docker image see my GPUs?



1 Reply


The TensorFlow image is split into several 'partial' Dockerfiles. One of them contains all the dependencies TensorFlow needs to run on a GPU. Using it, you can easily create a custom image; you only need to change the default Python to whatever version you need. This seems to me a much easier job than bringing NVIDIA's stuff into a Debian image (which, AFAIK, is not officially supported for CUDA and/or cuDNN).

Here's the Dockerfile:

# TensorFlow image base written by TensorFlow authors.
# Source: https://github.com/tensorflow/tensorflow/blob/v2.3.0/tensorflow/tools/dockerfiles/partials/ubuntu/nvidia.partial.Dockerfile
# -------------------------------------------------------------------------
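ARG ARCH=
ARG CUDA=10.1
# UBUNTU_VERSION is used in the FROM line but was never defined in this snippet;
# 18.04 is an assumption that matches the CUDA 10.1 / cuDNN 7 packages below.
ARG UBUNTU_VERSION=18.04
FROM nvidia/cuda${ARCH:+-$ARCH}:${CUDA}-base-ubuntu${UBUNTU_VERSION} as base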
# ARCH and CUDA are specified again because the FROM directive resets ARGs
# (but their default value is retained if set previously)
ARG ARCH
ARG CUDA
ARG CUDNN=7.6.4.38-1
ARG CUDNN_MAJOR_VERSION=7
ARG LIB_DIR_PREFIX=x86_64
ARG LIBNVINFER=6.0.1-1
ARG LIBNVINFER_MAJOR_VERSION=6

# Needed for string substitution
SHELL ["/bin/bash", "-c"]
# Pick up some TF dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        cuda-command-line-tools-${CUDA/./-} \
        # There appears to be a regression in libcublas10=10.2.2.89-1 which
        # prevents cublas from initializing in TF. See
        # https://github.com/tensorflow/tensorflow/issues/9489#issuecomment-562394257
        libcublas10=10.2.1.243-1 \
        cuda-nvrtc-${CUDA/./-} \
        cuda-cufft-${CUDA/./-} \
        cuda-curand-${CUDA/./-} \
        cuda-cusolver-${CUDA/./-} \
        cuda-cusparse-${CUDA/./-} \
        curl \
        libcudnn7=${CUDNN}+cuda${CUDA} \
        libfreetype6-dev \
        libhdf5-serial-dev \
        libzmq3-dev \
        pkg-config \
        software-properties-common \
        unzip

# Install TensorRT if not building for PowerPC
RUN [[ "${ARCH}" = "ppc64le" ]] || { apt-get update && \
        apt-get install -y --no-install-recommends libnvinfer${LIBNVINFER_MAJOR_VERSION}=${LIBNVINFER}+cuda${CUDA} \
        libnvinfer-plugin${LIBNVINFER_MAJOR_VERSION}=${LIBNVINFER}+cuda${CUDA} \
        && apt-get clean \
        && rm -rf /var/lib/apt/lists/*; }

# For CUDA profiling, TensorFlow requires CUPTI.
ENV LD_LIBRARY_PATH /usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# Link the libcuda stub to the location where tensorflow is searching for it and reconfigure
# dynamic linker run-time bindings
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1 \
    && echo "/usr/local/cuda/lib64/stubs" > /etc/ld.so.conf.d/z-cuda-stubs.conf \
    && ldconfig
# -------------------------------------------------------------------------
#
# Custom part
FROM base
ARG PYTHON_VERSION=3.7

RUN apt-get update && apt-get install -y --no-install-recommends --no-install-suggests \
          python${PYTHON_VERSION} \
          python3-pip \
          python${PYTHON_VERSION}-dev \
# Change default python
    && cd /usr/bin \
    && ln -sf python${PYTHON_VERSION}         python3 \
    && ln -sf python${PYTHON_VERSION}m        python3m \
    && ln -sf python${PYTHON_VERSION}-config  python3-config \
    && ln -sf python${PYTHON_VERSION}m-config python3m-config \
    && ln -sf python3                         /usr/bin/python \
# Update pip and add common packages
    && python -m pip install --upgrade pip \
    && python -m pip install --upgrade \
        setuptools \
        wheel \
        six \
# Cleanup
    && apt-get clean \
    && rm -rf $HOME/.cache/pip

You can take it from here: change the Python version to the one you need (as long as it is available in the Ubuntu repositories), add packages, code, etc.
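
Once the image is built and a container is started with --runtime=nvidia (as in the question), it can help to confirm that the GPUs are actually visible before installing anything heavier. The following is only a minimal sketch, not part of the TensorFlow Dockerfiles: it assumes that nvidia-smi is made available inside the container by the NVIDIA runtime, which may not hold for every setup, and the function names and the file name check_gpu.py are hypothetical. It uses only the standard library, so it runs on the Python 3.7 installed above without TensorFlow.

```python
import shutil
import subprocess


def parse_gpu_list(output):
    """Return the GPU description lines from `nvidia-smi -L` style output."""
    return [line.strip() for line in output.splitlines()
            if line.strip().startswith("GPU ")]


def list_gpus(executable="nvidia-smi"):
    """Return the GPUs visible to this process, or [] if none can be queried.

    Assumption: the NVIDIA container runtime exposes `nvidia-smi` on PATH.
    """
    path = shutil.which(executable)
    if path is None:
        # A missing binary usually means the container was started
        # without the NVIDIA runtime (or the tool is not exposed).
        return []
    try:
        result = subprocess.run([path, "-L"], capture_output=True,
                                text=True, timeout=30, check=False)
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    if gpus:
        print("\n".join(gpus))
    else:
        print("No GPU visible; was the container started with --runtime=nvidia?")
```

Saved as check_gpu.py and copied into the image, this can be run with python check_gpu.py inside the container. An empty result points at the way the container was started rather than at the Python installation.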

