I am getting surprisingly low performance when running the examples/image-classification/train_imagenet.py example on my Tesla V100 GPU. With synthetic data and a batch size of 64, I get roughly 540 img/sec when training either AlexNet or ResNet-50. I expected AlexNet to train much faster than this.
This is with the Python mxnet-cu91 package. Any guidance would be appreciated.
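To make the number above easier to sanity-check, here is a minimal, framework-independent sketch of how an img/sec figure can be measured by hand. The helper names `images_per_second` and `seconds_per_batch` and the `step_fn` callback are hypothetical and are not part of train_imagenet.py. The sketch assumes `step_fn` runs one full training step on one batch and does not return until the work is finished; since MXNet executes asynchronously, a real `step_fn` would need to block (for example by calling `mx.nd.waitall()`) before returning.

```python
import time


def images_per_second(step_fn, batch_size, num_batches=20, warmup=5):
    """Time num_batches calls of step_fn and return throughput in img/sec.

    step_fn is a hypothetical callback that runs one training step on one
    batch and blocks until that step has actually completed.
    """
    # Warm-up iterations are excluded from timing (startup cost, autotuning).
    for _ in range(warmup):
        step_fn()

    start = time.perf_counter()
    for _ in range(num_batches):
        step_fn()
    elapsed = time.perf_counter() - start

    return batch_size * num_batches / elapsed


def seconds_per_batch(throughput, batch_size):
    """Convert a throughput in img/sec into the time spent on one batch."""
    return batch_size / throughput


if __name__ == "__main__":
    # Stand-in for a real training step: just sleep for 10 ms.
    fake_step = lambda: time.sleep(0.01)
    print("measured img/sec:", images_per_second(fake_step, batch_size=64))
    # The figure from this post: 540 img/sec at batch size 64.
    print("seconds per batch:", seconds_per_batch(540, 64))
```

At 540 img/sec and a batch size of 64, each batch takes about 0.12 seconds, which is the per-step time to compare against for both networks.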
I also opened an issue here, in case you need more details: