My first experiments with mxnet show a speed difference of at least a factor of 2 (in some models even 4-5) between the module/symbol API (which is faster) and the gluon API (which is slower).
I am very new to mxnet, and it is quite likely that my approach contains fundamental flaws that would explain the observed differences. I cannot see what they would be, though, as I have taken the code from the tutorials and API documentation on the mxnet website.
I first noticed the problem with mxnet-cu90mkl version 1.3.1, but I can also reproduce it with the plain mxnet 1.3.1 installation, purely on CPU, with a very simple 3-layer MLP architecture.
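To make clear what I mean by "a factor of 2", here is a minimal, library-independent sketch of how such a ratio can be measured. It is not the code from the notebook: `time_epochs`, `speed_ratio` and the two `fake_*_epoch` functions are hypothetical names, and the sleeps merely stand in for one training epoch with each API.

```python
import time


def time_epochs(train_one_epoch, n_epochs=3):
    """Run train_one_epoch n_epochs times; return the best wall-clock time in seconds."""
    timings = []
    for _ in range(n_epochs):
        start = time.perf_counter()
        train_one_epoch()
        # Note: mxnet executes operations asynchronously, so a real epoch
        # function should end with mx.nd.waitall() before the clock stops.
        timings.append(time.perf_counter() - start)
    return min(timings)


def speed_ratio(slow_fn, fast_fn, n_epochs=3):
    """Return how many times longer slow_fn takes per epoch than fast_fn."""
    return time_epochs(slow_fn, n_epochs) / time_epochs(fast_fn, n_epochs)


# Hypothetical stand-ins for one training epoch with each API (assumption:
# in a real comparison these would train the same MLP on the same data).
def fake_gluon_epoch():
    time.sleep(0.05)


def fake_symbol_epoch():
    time.sleep(0.01)


if __name__ == "__main__":
    ratio = speed_ratio(fake_gluon_epoch, fake_symbol_epoch)
    print(f"gluon / symbol time per epoch: {ratio:.1f}x")
```

Taking the minimum over several epochs is one way to reduce the influence of warm-up and other one-off costs on the comparison.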
I’ve created a github repository with jupyter notebook(s) to show what I have done:
I’ve also provided the conda environment with exact versions to reproduce what I am seeing.
You can also go directly to the jupyter notebook here: https://nbviewer.jupyter.org/github/cs224/mxnet-gluon-vs-sym-and-others/blob/master/mxnet-gluon-vs-sym-speed.ipynb?flush_cache=true
My question is whether this speed difference is expected, or whether I am doing something fundamentally wrong.