Forward pass performance (for one image) is quite slow. Concerns mxnet 0.11.0


#1

I trained a DenseNet from scratch (net = mx.gluon.model_zoo.vision.densenet121(classes=53)) and I am satisfied with the top-k accuracy on the test set. However, when I measure the time for one forward pass (a single image of size (224,224)), I cannot get below about 200 ms/image. My measurements show that the timing is significantly influenced by the call to .asnumpy() when making predictions. As I understand it, NDArray computations are asynchronous, so .asnumpy() blocks until the computation has finished.

My question is whether 200 ms is far off for an Intel® Xeon® CPU E5-2620 v3 @ 2.40GHz, or whether this is the performance I should expect. Does anyone have experience with this?


#2

You probably need to warm up the engine and average the inference time over several runs.
If you still get similar results, check the build, e.g. which BLAS library it uses (OpenBLAS, MKL…)
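
A minimal sketch of what warming up and averaging could look like. The helper name `benchmark` and its parameters `forward`, `warmup` and `runs` are made up for illustration and are not part of MXNet. It assumes the callable you pass in blocks until the result is ready (for example by ending with .asnumpy()), otherwise only the time to queue the work is measured.

```python
import statistics
import time


def benchmark(forward, warmup=10, runs=50):
    """Time a blocking callable; return (mean_ms, stdev_ms) over `runs` calls.

    `forward` is a hypothetical zero-argument callable that performs one
    forward pass and waits for the result. `runs` must be at least 2 so
    that a standard deviation can be computed.
    """
    # Warm-up calls are not timed: the first passes typically pay for
    # one-off costs such as lazy initialization and memory allocation.
    for _ in range(warmup):
        forward()

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        forward()  # must block until the output is actually computed
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    return statistics.mean(timings_ms), statistics.stdev(timings_ms)


# Assumed usage with the network from post #1 (requires mxnet, not run here):
#   import mxnet as mx
#   x = mx.nd.ones((1, 3, 224, 224))
#   mean_ms, std_ms = benchmark(lambda: net(x).asnumpy())
#   print("%.1f ms +/- %.1f ms per image" % (mean_ms, std_ms))
```

Reporting the spread next to the mean makes it easier to see whether a single slow first call was skewing the earlier number.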


#3

Agreed, MKL will speed things up a lot!