MXNet compiled from the latest master branch is slower than the pip-installed version?

mg0880gm · October 19, 2017, 10:28pm

Hi,

To improve performance, I cloned the master branch and compiled from source following the tutorial here: https://mxnet.incubator.apache.org/get_started/install.html. The make command I used was:

make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1 USE_PROFILER=1

After installing the compiled mxnet, the training script ran even slower. Previously, every 500 batches took about 3 minutes 55 seconds. With the recompiled version, the same script on the same datasets took about 4 minutes 10 seconds. I also uninstalled it and recompiled without the profiler enabled, but the result was the same.

Then I uninstalled it and installed mxnet-cu80 with pip install again. The speed went back to about 3 minutes 55 seconds per 500 batches. The script uses a single GPU to train a neural network model. In terms of performance, is there any optimization that can be done when compiling from source?
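
For scale, the two timings quoted above can be turned into a relative slowdown. This is only a minimal arithmetic sketch in Python; the helper name `to_seconds` and the variable names are made up for illustration, and the inputs are the approximate figures from this post, not new measurements.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair into total seconds."""
    return minutes * 60 + seconds

# Approximate time per 500 batches, as reported in the post.
pip_time = to_seconds(3, 55)      # pip-installed mxnet-cu80
source_time = to_seconds(4, 10)   # build from the master branch

# Relative slowdown of the source build versus the pip build.
slowdown_pct = (source_time - pip_time) / pip_time * 100
print(f"source build is about {slowdown_pct:.1f}% slower")
```

A gap of this size is small enough that it may be worth repeating each run a few times to see whether it is larger than the run-to-run variation.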

madjam · October 19, 2017, 11:01pm

Which version did you pip install?

Can you check whether the pre-release pip version also has the performance issue? If so, that could be a regression.

pip install mxnet-cu80 --pre --user
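
One possible way to answer the version question is to ask the Python environment which MXNet distributions it has registered. The sketch below assumes Python 3.8 or newer (for `importlib.metadata`); the list of candidate names is an assumption and should be adjusted to whichever wheel was installed, and `installed_versions` is a hypothetical helper, not part of MXNet.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed candidate distribution names; edit to match your install.
CANDIDATES = ["mxnet", "mxnet-cu80"]

def installed_versions(names):
    """Return {distribution name: version} for the names that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            pass  # this distribution is not installed in this environment
    return found

versions = installed_versions(CANDIDATES)
print(versions)
```

An empty result only means none of the listed names is registered in the current environment. If MXNet imports successfully, `mxnet.__version__` should report the same information.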
