gluon/image_classification.py: img/sec increases when metric update and reset are turned off

vgangal101 · September 18, 2019, 4:55pm

I am working with the example/gluon/image_classification.py script from the Apache incubator-mxnet GitHub repository and have been tinkering with it. When I run the following command:

python image_classification.py --model resnet50_v1 --dataset dummy --epochs 1 --log-interval 1 --batch-size 64 --gpu 0

I get about 323.0 img/sec on a Tesla V100 GPU.

If I run the same command with the metric computations turned off (lines 212 and 229), I get about 1400 img/sec.

Why am I seeing this speedup? Why does turning off the metric computations result in such a large increase in img/sec?

Environment:
miniconda environment with mxnet-cu101, cuda/10.1, cudnn/7.4, python 2.7.15
mxnet was pip installed
Tests were run on a Tesla V100 GPU

NRauschmayr · September 20, 2019, 5:31am

The speedup is expected. Metric.update converts the MXNet NDArrays into NumPy arrays, which causes the slowdown. You can look at the code here: https://mxnet.incubator.apache.org/_modules/mxnet/metric.html#CustomMetric.update It calls asnumpy(), which is a blocking call: it waits for the pending computation on the array to finish before copying the result to the host. There is also a related GitHub issue: https://github.com/apache/incubator-mxnet/issues/9571
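
To make the effect of a blocking call concrete, here is a rough standard-library analogy, not MXNet code. It assumes a single worker thread standing in for the GPU work queue, a hypothetical fake_step function standing in for one training step, and fut.result() playing the role of asnumpy(). All names and timings are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

STEP_SECONDS = 0.01   # pretend cost of one training step on the device
NUM_STEPS = 20

def fake_step(i):
    # Hypothetical stand-in for the forward/backward work done on the GPU.
    time.sleep(STEP_SECONDS)
    return i

def run_loop(block_every_step):
    """Return (time spent inside the loop, time until all work is done)."""
    # One worker = one queue of pending work, loosely like one device.
    with ThreadPoolExecutor(max_workers=1) as engine:
        start = time.perf_counter()
        futures = []
        for i in range(NUM_STEPS):
            fut = engine.submit(fake_step, i)  # returns immediately
            if block_every_step:
                fut.result()                   # wait for the value, like asnumpy()
            futures.append(fut)
        loop_time = time.perf_counter() - start
        for fut in futures:                    # drain the queue, like mx.nd.waitall()
            fut.result()
        total_time = time.perf_counter() - start
    return loop_time, total_time

blocking_loop, blocking_total = run_loop(block_every_step=True)
async_loop, async_total = run_loop(block_every_step=False)

print("blocking: loop %.3fs, total %.3fs" % (blocking_loop, blocking_total))
print("async:    loop %.3fs, total %.3fs" % (async_loop, async_total))
```

In this sketch the loop without the per-step wait finishes almost instantly, but the queued work still takes about as long in total. So a timer read inside the loop can report a higher rate than the device actually sustains unless something synchronizes first. Whether that accounts for the whole difference seen in this thread is not established here.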
