CUDA driver version is insufficient for CUDA runtime version


#1

Hello all,

I am trying to install MXNet, following precisely these instructions: https://mxnet.incubator.apache.org/install/
for
Mac OS 10.13.4
Dual GPU -> Intel Iris Pro and NVIDIA GeForce GT 750M (CUDA compatible)
Python
Build from source

I could follow the instructions without any problems.

I could install the CUDA 9.1 driver and toolkit. But the cuDNN downloads were only for CUDA 9.0 or CUDA 9.2, with no files for 9.1 (all of them are version 7.1.4), so I went for the cuDNN build for CUDA 9.2.

When I try to run the test command at the end (python example/image-classification/train_mnist.py --network lenet --gpus 0), I get:
src/storage/storage.cc:119: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading CUDA: CUDA driver version is insufficient for CUDA runtime version

This is in relation to libmxnet.so. I can provide more info on the dump if needed.

After installation I had CUDA Driver Version: 387.128, which gives me the error message above. The NVIDIA console on startup told me to update to 396.64, which I did, but the error message did not change, even after a reboot.
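
The error compares two numbers: the highest CUDA version the installed driver supports, and the CUDA runtime version the program was linked against. A minimal sketch for printing both is shown here. It assumes libcudart can be found on the loader's search path (otherwise pass its full path through the hypothetical `library_path` argument); the helper names are illustrative and not part of MXNet. The two CUDA calls used, cudaDriverGetVersion and cudaRuntimeGetVersion, are part of the CUDA runtime API and report versions encoded as 1000 * major + 10 * minor.

```python
import ctypes
import ctypes.util


def decode_cuda_version(value):
    """Turn CUDA's integer encoding (1000 * major + 10 * minor) into 'major.minor'."""
    major, remainder = divmod(value, 1000)
    return "{}.{}".format(major, remainder // 10)


def driver_is_sufficient(driver_value, runtime_value):
    """The driver must support at least the CUDA version of the runtime."""
    return driver_value >= runtime_value


def query_cuda_versions(library_path=None):
    """Return (driver, runtime) as integers, or None if libcudart is unavailable."""
    path = library_path or ctypes.util.find_library("cudart")
    if path is None:
        return None
    try:
        cudart = ctypes.CDLL(path)
    except OSError:
        return None
    driver = ctypes.c_int(0)
    runtime = ctypes.c_int(0)
    # Both calls return 0 (cudaSuccess) when they succeed.
    if cudart.cudaDriverGetVersion(ctypes.byref(driver)) != 0:
        return None
    if cudart.cudaRuntimeGetVersion(ctypes.byref(runtime)) != 0:
        return None
    return driver.value, runtime.value


if __name__ == "__main__":
    versions = query_cuda_versions()
    if versions is None:
        print("Could not load libcudart or query it; try passing its full path.")
    else:
        driver_value, runtime_value = versions
        print("Driver supports CUDA up to:", decode_cuda_version(driver_value))
        print("Runtime version:", decode_cuda_version(runtime_value))
        print("Driver sufficient:", driver_is_sufficient(driver_value, runtime_value))
```

If the driver number comes out lower than the runtime number (or as 0.0, which cudaDriverGetVersion reports when no driver is loaded), that would match the error above.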

I’ve searched around for this error message. There is a lot of info, but none of it relates to MXNet, so I’d prefer to get guidance here before changing anything.

I think I’m almost there.

thanks


#2
  • Try installing cuDNN for CUDA 9.0 rather than 9.2. Since you said you are running CUDA 9.1, the 9.2 build wouldn’t be compatible with your version of CUDA.
  • Rebuild from source

#3

Thanks Thomas,

so I went to fetch the 9.1 drivers here:
https://developer.nvidia.com/rdp/cudnn-download

But unlike for 9.2, where they provide an installation file for Mac, for 9.1 I can only find a zipped file for Linux. So I unzipped it and replaced the files as per these instructions:
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib/libcudnn* /usr/local/cuda/lib

I then closed and restarted my terminal, but nothing changed: same error message.

I see the cudnn.h file in other places on my hard disk; one of them is in:
developer/NVIDIA/CUDA-9.1/INCLUDE
I wonder if I also need to overwrite this one with the file I just downloaded.
I must confess I’m getting confused as to what is installed where and what is actually running in the end.

Do I then need to restart my computer?
Or do I need to run make again?


#4

Unfortunately it is hard to tell, as I am not sure which steps you have followed or what the current state of your installation is. I think it could be a good idea to replace the cuDNN library in your CUDA install, as it might still be the previous one you installed.
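
To see which cuDNN header is actually sitting in a given location, something like the following sketch may help. It assumes a cuDNN 7.x cudnn.h, which defines the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros; the function name is made up for illustration, and the default path is simply the /usr/local/cuda/include location used in the copy commands earlier in this thread. Note that the header only gives the cuDNN version, not which CUDA toolkit that build targets.

```python
import re
import sys

# Matches lines such as "#define CUDNN_MAJOR 7" in cudnn.h.
_DEFINE = re.compile(
    r"^\s*#\s*define\s+CUDNN_(MAJOR|MINOR|PATCHLEVEL)\s+(\d+)", re.MULTILINE
)


def cudnn_version_from_header(text):
    """Return 'major.minor.patch' from the contents of a cudnn.h, or None."""
    found = {name: int(value) for name, value in _DEFINE.findall(text)}
    if not all(key in found for key in ("MAJOR", "MINOR", "PATCHLEVEL")):
        return None
    return "{MAJOR}.{MINOR}.{PATCHLEVEL}".format(**found)


if __name__ == "__main__":
    # Pass one or more header paths on the command line to compare copies.
    paths = sys.argv[1:] or ["/usr/local/cuda/include/cudnn.h"]
    for path in paths:
        try:
            with open(path, "r", errors="replace") as handle:
                version = cudnn_version_from_header(handle.read())
        except OSError as error:
            print("{}: cannot read ({})".format(path, error))
            continue
        print("{}: cuDNN {}".format(path, version or "version macros not found"))
```

Running it against each copy of cudnn.h found on the disk would show whether they are all the same release.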

In any case, try a make clean and then rebuild. Can you also share which build flags you are using?

Purge your old CUDA installs from your system so that only the one you want to use remains.


#5

Thanks Thomas,

I will uninstall everything carefully tonight and rebuild it.

I’m using these flags:
USE_CUDA = 1
USE_CUDA_PATH = /usr/local/cuda
USE_CUDNN = 1
USE_OPENCV = 1

What I’m missing is an understanding of what MXNet is actually doing, not in great detail, but just enough to understand what I’m doing. Is it re-compiling a version of Python? I ask because I need to end up with Python 3.6 to run the fast.ai deep learning environment on my Mac (http://course.fast.ai/lessons/lesson1.html).