Error running C++ package example with GPU

When I run the C++ examples with GPU, I get the following error. How can I fix it?

terminate called after throwing an instance of 'dmlc::Error'
what(): [00:29:01] …/include/mxnet-cpp/ndarray.hpp:54: Check failed: MXNDArrayCreate(shape.data(), shape.size(), context.GetDeviceType(), context.GetDeviceId(), delay_alloc, &handle) == 0 (-1 vs. 0)

Stack trace returned 7 entries:
[bt] (0) ./mlp_gpu(dmlc::StackTrace[abi:cxx11]()+0x54) [0x4098b6]
[bt] (1) ./mlp_gpu(dmlc::LogMessageFatal::~LogMessageFatal()+0x2a) [0x409b82]
[bt] (2) ./mlp_gpu() [0x40daad]
[bt] (3) ./mlp_gpu() [0x40d389]
[bt] (4) ./mlp_gpu() [0x4076d6]
[bt] (5) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f68f94a4830]
[bt] (6) ./mlp_gpu() [0x405ae9]

Could you tell me how you built MXNet and how you built the example?
To build MXNet with the C++ package, you need to specify the USE_CPP_PACKAGE=1 flag.
To build the examples, run make all in the example folder.
You can also copy libmxnet.so into the example folder.
What version of MXNet are you using?

This is with latest master:

ubuntu@ip-172-31-23-125:~/incubator-mxnet/cpp-package/example$ ./mlp_gpu
[21:50:29] src/io/iter_mnist.cc:110: MNISTIter: load 60000 images, shuffle=1, shape=(100,784)
[21:50:30] src/io/iter_mnist.cc:110: MNISTIter: load 10000 images, shuffle=1, shape=(100,784)
[21:50:33] mlp_gpu.cpp:135: Epoch[0] 178571 samples/sec Train-Accuracy=0.111983
[21:50:33] mlp_gpu.cpp:150: Epoch[0] Val-Accuracy=0.1135
[21:50:33] mlp_gpu.cpp:135: Epoch[1] 182371 samples/sec Train-Accuracy=0.335383
[21:50:33] mlp_gpu.cpp:150: Epoch[1] Val-Accuracy=0.6327
[21:50:34] mlp_gpu.cpp:135: Epoch[2] 174927 samples/sec Train-Accuracy=0.804833
[21:50:34] mlp_gpu.cpp:150: Epoch[2] Val-Accuracy=0.8497
[21:50:34] mlp_gpu.cpp:135: Epoch[3] 182371 samples/sec Train-Accuracy=0.8791
[21:50:34] mlp_gpu.cpp:150: Epoch[3] Val-Accuracy=0.889
[21:50:34] mlp_gpu.cpp:135: Epoch[4] 184615 samples/sec Train-Accuracy=0.905467
[21:50:34] mlp_gpu.cpp:150: Epoch[4] Val-Accuracy=0.9075
[21:50:35] mlp_gpu.cpp:135: Epoch[5] 183486 samples/sec Train-Accuracy=0.918267
[21:50:35] mlp_gpu.cpp:150: Epoch[5] Val-Accuracy=0.9186
[21:50:35] mlp_gpu.cpp:135: Epoch[6] 187500 samples/sec Train-Accuracy=0.925717
[21:50:35] mlp_gpu.cpp:150: Epoch[6] Val-Accuracy=0.926
[21:50:35] mlp_gpu.cpp:135: Epoch[7] 184049 samples/sec Train-Accuracy=0.929633
[21:50:36] mlp_gpu.cpp:150: Epoch[7] Val-Accuracy=0.93
[21:50:36] ../include/mxnet-cpp/lr_scheduler.h:81: Update[5001]: Change learning rate to 0.01
[21:50:36] mlp_gpu.cpp:135: Epoch[8] 182927 samples/sec Train-Accuracy=0.93525
[21:50:36] mlp_gpu.cpp:150: Epoch[8] Val-Accuracy=0.9387
[21:50:36] mlp_gpu.cpp:135: Epoch[9] 188679 samples/sec Train-Accuracy=0.93895
[21:50:36] mlp_gpu.cpp:150: Epoch[9] Val-Accuracy=0.9397

I am using the latest master, and my compile command is as follows:

make -j8 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1 USE_CPP_PACKAGE=1

I added logging to mlp_gpu.cpp and got the following message:

[16:58:20] mlp_gpu.cpp:82: [16:58:20] src/imperative/./imperative_utils.h:76: GPU support is disabled. Compile MXNet with USE_CUDA=1 to enable GPU support.

I also tried copying libmxnet.so from the pip install path (I can use the GPU in my Python code), but I got the same error.

The error you are getting is the one raised when MXNet is built without GPU support (as the message correctly states). If you have built everything as described, copied libmxnet.so from the built lib folder into the example folder, also tried the libmxnet.so from the pip install mxnet-cu9x install folder, and it still doesn't work, my guess is that the executable is picking up a wrong libmxnet.so from somewhere else.

Here are the different places where your executable looks for its dynamic libraries: https://en.wikipedia.org/wiki/Rpath

I would suggest having a look at that and trying to find where your libmxnet.so is loaded from. Also try to look for all copies of libmxnet.so on your system.
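
As a rough sketch of that check on Linux: ldd shows which file the dynamic loader resolves for the binary, and find lists the other copies on disk. The helper names resolved_lib and find_lib_copies are made up for this example, and the search directories are only guesses at common locations, so adjust them to your setup.

```shell
#!/bin/sh
# Sketch: find out which libmxnet.so a binary actually loads.
# The function names below are hypothetical helpers, not part of MXNet.

# Print the line(s) of ldd output that mention a given library.
# $1 = path to the executable, $2 = library name to look for.
resolved_lib() {
    ldd "$1" 2>/dev/null | grep -F "$2"
}

# List every copy of a library under a set of directories.
# $1 = file name pattern, remaining arguments = directories to search.
find_lib_copies() {
    name=$1
    shift
    find "$@" -name "$name" 2>/dev/null
}

# Example usage; it only runs if the example binary is in the current directory.
# The directories listed here are assumptions about where copies might live.
if [ -x ./mlp_gpu ]; then
    resolved_lib ./mlp_gpu libmxnet.so
    find_lib_copies 'libmxnet.so*' /usr/local/lib /usr/lib "$HOME"
fi
```

If the path printed for libmxnet.so is not the library you just built with USE_CUDA=1, that other copy is likely the one without GPU support.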

Thanks Thomas, I solved it by replacing /usr/local/lib/libmxnet.so with my compiled lib.