Running MXNet compiled with CUDA 9 on a machine with CUDA 10

gpu
installation
#1

Hi,

I’m trying to run MXNet compiled for CUDA 9 on a machine that only has CUDA 10.
For other CUDA applications, this seems to be possible as long as I specify the correct target GPU generation and statically link the runtime into the application when I run ‘nvcc’ (at the cost of a much fatter binary).

But does this hold true for MXNet as well? Is there anything about MXNet that prevents it from running on a machine with a more recent CUDA version?

I tried running the CUDA 9 build of MXNet on a CUDA 10 machine, but it keeps looking for libcudart.so.9.0.

Thanks!

#2

Why don’t you install mxnet-cu100?
If you are running on Linux, you can run ldd on libmxnet.so. This will show you which shared libraries it requires. For instance, in the following case libmxnet.so requires libcudart.so.9.0. If it cannot find this library, it will just crash.

ldd /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmxnet.so
        libcudart.so.9.0 => /usr/local/cuda/lib64/libcudart.so.9.0 
        libcublas.so.9.0 => /usr/local/cuda/lib64/libcublas.so.9.0 
        libcurand.so.9.0 => /usr/local/cuda/lib64/libcurand.so.9.0 
        libcusolver.so.9.0 => /usr/local/cuda/lib64/libcusolver.so.9.0 
        libmklml_intel.so => /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmklml_intel.so 
        libiomp5.so => /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libiomp5.so 
        librt.so.1 => /lib64/librt.so.1
        libmkldnn.so.0 => /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libmkldnn.so.0
        libdl.so.2 => /lib64/libdl.so.2 
        libgfortran.so.3 => /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libgfortran.so.3 
        libcufft.so.9.0 => /usr/local/cuda/lib64/libcufft.so.9.0 
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 
        libm.so.6 => /lib64/libm.so.6 
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 
        libpthread.so.0 => /lib64/libpthread.so.0 
        libc.so.6 => /lib64/libc.so.6 
        /lib64/ld-linux-x86-64.so.2 
        libquadmath.so.0 => /home/ec2-user/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/libquadmath.so.0
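
To narrow that listing down to the CUDA libraries, something like the following could help. It is only a sketch: cuda_deps is a made-up helper name, and the list of library names is taken from the output above, not from any official list.

```shell
# Read `ldd` output on stdin and print only the CUDA toolkit libraries,
# each followed by the path it resolves to, or MISSING if ldd reports
# "not found" for it.
# The names in the pattern (cudart, cublas, curand, cusolver, cufft) are
# the ones that show up in the listing above; extend it if your build
# links others.
cuda_deps() {
    awk '$1 ~ /^libcu(dart|blas|rand|solver|fft)/ {
        print $1, ($3 == "not" ? "MISSING" : $3)
    }'
}

# Usage (replace the path with your own libmxnet.so):
#   ldd /path/to/libmxnet.so | cuda_deps
```

On a machine that only has CUDA 10, I would expect the .so.9.0 entries to come back as MISSING, which matches the error about libcudart.so.9.0.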

#3

Thanks for the response. I’m trying to build a single binary that is forward compatible in terms of both GPU generation and CUDA version.

For other CUDA applications, this is possible when you specify ‘arch’ for the target GPUs and ‘cudart’ to statically link the runtime into the binary, and I want the same to be true for MXNet as well.

#4

When you build MXNet from source, it will create a shared and a static library. Did you set the CMake options CUDA_USE_STATIC_CUDA_RUNTIME and CUDA_cudart_static_LIBRARY?

#5

Thanks for the information.

But could you explain how I should use these options?

Currently, I’m building ‘libmxnet.so’ from source and then installing the Python bindings. At which point should I set the two options you mentioned?

#6

Before building MXNet, you can go to the source directory (the one containing CMakeLists.txt) and configure the build with ccmake. In case you don’t have ccmake installed: sudo apt-get install cmake-curses-gui. Then in the source directory of MXNet do:
ccmake .
Then you need to configure (press c); this creates the CMakeCache and loads all the options. You may need to press t to switch to advanced mode, since some options may be hidden. Once all options are loaded, you can start setting them (e.g. CUDA_USE_STATIC_CUDA_RUNTIME). Once this is done, configure again: this will check all the dependencies. If there are no missing dependencies, you can generate the Makefile afterwards (press g). Once you have the Makefile, you can compile MXNet with make.
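
If you prefer to skip the interactive screen, the same options can be passed to cmake on the command line with -D. The following is a rough sketch and I have not tested it against a particular MXNet version. The function name, the USE_CUDA switch, the two directory arguments and the location of libcudart_static.a are my assumptions; only CUDA_USE_STATIC_CUDA_RUNTIME and CUDA_cudart_static_LIBRARY come from this thread.

```shell
# Configure an out-of-source build with the static CUDA runtime.
#   $1  absolute path to the MXNet source checkout (hypothetical)
#   $2  build directory, created if it does not exist
# CUDA_LIB_DIR can override where libcudart_static.a is looked up
# (assumed default: /usr/local/cuda/lib64).
# Set CMAKE=echo to print the arguments instead of running cmake.
configure_static_cudart() {
    src_dir="$1"
    build_dir="$2"
    cuda_lib="${CUDA_LIB_DIR:-/usr/local/cuda/lib64}"
    mkdir -p "$build_dir" || return 1
    (
        # Subshell so the caller's working directory is left unchanged.
        cd "$build_dir" &&
        "${CMAKE:-cmake}" \
            -DUSE_CUDA=ON \
            -DCUDA_USE_STATIC_CUDA_RUNTIME=ON \
            -DCUDA_cudart_static_LIBRARY="$cuda_lib/libcudart_static.a" \
            "$src_dir"
    )
}

# Usage (paths are placeholders):
#   configure_static_cudart /path/to/mxnet /path/to/mxnet/build
#   make -C /path/to/mxnet/build
```

After the build, running ldd on the new libmxnet.so should tell you whether libcudart.so.9.0 is still listed. As far as I know these options only cover the CUDA runtime itself, so libraries such as libcublas.so.9.0 may still be linked dynamically.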