Question about multiple CUDA versions

Source: 1-9 Google_cloud_gpu_tensorflow configuration

UN_Helium

2019-07-17

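Teacher, I installed the GPU build of tf1.x in the conda base environment. I have now created a conda env for tf2.0. If I follow the installation method shown in your video inside that env, will it affect my tf1.x GPU environment?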

3 answers

UN_Helium

Asker

2019-07-18

>>> tf.test.is_gpu_available()
2019-07-18 11:24:27.219720: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-18 11:24:27.255866: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3399825000 Hz
2019-07-18 11:24:27.256476: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x555c9dd55e80 executing computations on platform Host. Devices:
2019-07-18 11:24:27.256512: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-07-18 11:24:27.257622: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-07-18 11:24:27.262431: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2019-07-18 11:24:27.262472: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: pop-os
2019-07-18 11:24:27.262485: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: pop-os
2019-07-18 11:24:27.262543: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 430.34.0
2019-07-18 11:24:27.262574: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 418.56.0
2019-07-18 11:24:27.262591: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:313] kernel version 418.56.0 does not match DSO version 430.34.0 -- cannot find working devices in this configuration

I get the messages above. How do I fix this?
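
The last line of the log points at the likely cause: the loaded NVIDIA kernel module reports 418.56.0 while the user-space libcuda reports 430.34.0. This commonly happens when the driver packages were upgraded but the machine has not been rebooted since, so rebooting (or reinstalling one consistent driver version) is usually the first thing to try. A minimal sketch in Python for pulling the two versions out of such a log; the helper names `driver_versions` and `versions_match` are hypothetical and not part of TensorFlow:

```python
import re

# Patterns for the two diagnostic lines TensorFlow printed in the log above.
LIBCUDA_RE = re.compile(r"libcuda reported version is: ([\d.]+)")
KERNEL_RE = re.compile(r"kernel reported version is: ([\d.]+)")


def driver_versions(log_text):
    """Return (libcuda_version, kernel_version); either may be None if absent."""
    lib = LIBCUDA_RE.search(log_text)
    ker = KERNEL_RE.search(log_text)
    return (lib.group(1) if lib else None, ker.group(1) if ker else None)


def versions_match(log_text):
    """True only if both versions were found in the log and they are identical."""
    lib, ker = driver_versions(log_text)
    return lib is not None and lib == ker


# The two relevant lines, copied from the log in this post.
sample = (
    "2019-07-18 11:24:27.262543: I tensorflow/stream_executor/cuda/"
    "cuda_diagnostics.cc:200] libcuda reported version is: 430.34.0\n"
    "2019-07-18 11:24:27.262574: I tensorflow/stream_executor/cuda/"
    "cuda_diagnostics.cc:204] kernel reported version is: 418.56.0\n"
)

print(driver_versions(sample))
print(versions_match(sample))
```

If the two versions still differ after a reboot, that would suggest two driver versions are installed side by side, which is outside what this sketch can diagnose.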

UN_Helium

Asker

2019-07-18

These are the instructions from the TensorFlow GPU support guide.

Ubuntu 18.04 (CUDA 10)

    # Add NVIDIA package repositories
    wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
    sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
    sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
    sudo apt-get update
    wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
    sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
    sudo apt-get update
    # Install NVIDIA driver
    sudo apt-get install --no-install-recommends nvidia-driver-410
    # Reboot. Check that GPUs are visible using the command: nvidia-smi

The part above was already installed in the base environment when I installed the 1.x version.

If I go on to run the following commands in virtual environment 2, will that affect the base environment?

    # Install development and runtime libraries (~4GB)
    sudo apt-get install --no-install-recommends \
        cuda-10-0 \
        libcudnn7=7.4.1.5-1+cuda10.0  \
        libcudnn7-dev=7.4.1.5-1+cuda10.0
    
    # Install TensorRT. Requires that libcudnn7 is installed above.
    sudo apt-get update && \
        sudo apt-get install nvinfer-runtime-trt-repo-ubuntu1804-5.0.2-ga-cuda10.0 \
        && sudo apt-get update \
        && sudo apt-get install -y --no-install-recommends libnvinfer-dev=5.0.2-1+cuda10.0

正十七

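A virtual environment only virtualizes the Python environment. Changes to the GPU setup should be global: commands such as `sudo apt-get install` act on the whole system, not on the active conda env.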
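
To make the distinction concrete: a conda env redirects the Python interpreter and its packages (everything under `sys.prefix`), whereas files installed with apt land in system directories shared by every env. A minimal sketch, assuming typical install locations; the paths below are illustrative assumptions, not read from the asker's machine, and `is_inside_env` is a hypothetical helper:

```python
import os
import sys


def is_inside_env(path, env_prefix=sys.prefix):
    """True if `path` lies under the environment prefix (lexical check only)."""
    path = os.path.abspath(path)
    env_prefix = os.path.abspath(env_prefix)
    return os.path.commonpath([path, env_prefix]) == env_prefix


# Hypothetical conda env location, for illustration only.
env = "/home/user/miniconda3/envs/tf2"

# A pip- or conda-installed package lives inside the env.
print(is_inside_env(env + "/lib/python3.7/site-packages/tensorflow", env))

# apt-installed CUDA and cuDNN files typically live in system directories
# (assumed locations), so they are outside the env and shared with base.
print(is_inside_env("/usr/local/cuda-10.0/lib64/libcudart.so.10.0", env))
print(is_inside_env("/usr/lib/x86_64-linux-gnu/libcudnn.so.7", env))
```

Calling `is_inside_env(some_path)` without the second argument checks against whichever environment the running interpreter belongs to.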

2019-07-21
