Slow to retrieve data after inference

Hello,

I’m using GluonCV for a detection task and I have run into something that seems really weird to me:
inference is very quick, but the first time I access the result of the inference it takes ages, and how long depends on the model used.

For example:
With SSD:
‘‘cid, score, bbox = net(batch_img)’’ takes 0.013 sec
‘‘cid = cid.asnumpy()’’ takes 0.2 sec

With YOLO:
‘‘cid, score, bbox = net(batch_img)’’ takes 0.014 sec
‘‘cid = cid.asnumpy()’’ takes 1.5 sec

The two lines of code are called consecutively. In fact, I can change the second line: whether I print cid, score, or bbox or do any other operation with them, the first access is always very slow, and the following ones are quick again.

So if I do:

‘‘cid, score, bbox = net(batch_img)
print(cid)
cid = cid.asnumpy()’’

The first line takes 0.014 sec, the second 1.5 sec, and the third 0.00004 sec.

Can someone explain why this is happening and how to speed things up? I really want to use YOLO as the detector since its bboxes are far more relevant than SSD’s, but this makes it far too slow.

Thank you for your help

Hi Lucas,

The calls to net() are actually asynchronous, so they return immediately. It’s only when you try to access the result of net(), either by converting the MXNet array to a NumPy array or by printing cid, that the access blocks until the inference has finished.
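
To see this effect in isolation, here is a minimal sketch of how one might time the dispatch and the blocking wait separately. The helper time_forward and the stand-in fake_net are hypothetical names made up for this illustration; fake_net is not MXNet, it just mimics an asynchronous call with a worker thread so the snippet runs anywhere.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_forward(forward, batch, wait):
    """Time an asynchronous call: dispatch alone vs. dispatch plus blocking wait."""
    start = time.perf_counter()
    outputs = forward(batch)      # returns as soon as the work is queued
    dispatch_s = time.perf_counter() - start
    wait(outputs)                 # blocks until the results actually exist
    total_s = time.perf_counter() - start
    return outputs, dispatch_s, total_s


# Stand-in for an asynchronous network (NOT MXNet): the "forward pass"
# sleeps 0.2 s on a worker thread and a future is handed back immediately.
_pool = ThreadPoolExecutor(max_workers=1)


def _slow_double(batch):
    time.sleep(0.2)
    return [x * 2 for x in batch]


def fake_net(batch):
    return _pool.submit(_slow_double, batch)


outputs, dispatch_s, total_s = time_forward(
    fake_net, [1, 2, 3], lambda fut: fut.result()
)
print(f"dispatch: {dispatch_s:.4f} s, dispatch + wait: {total_s:.4f} s")
```

With MXNet, the same helper could presumably be called as time_forward(net, batch_img, lambda outs: mx.nd.waitall()), since mx.nd.waitall() blocks until all queued computation has finished. The second number is then the real inference time, and it is the one to compare between SSD and YOLO.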

If your inference is slow, it is because those calculations take time. Try using a GPU, using fewer classes, making the input image smaller, or using a model based on a smaller base network (e.g. MobileNet).

hth,

Lieven

Hi Lieven,

Thank you very much for your quick answer. I understand better now. I’ve tried with a single GPU as you suggested and now my speed problem is gone.

Thanks again and I apologize for the noob question,

Lucas

Also, for debugging purposes, you can run MXNet with the env var MXNET_ENGINE_TYPE=NaiveEngine, which makes MXNet use a synchronous engine instead of the default ThreadedEnginePerDevice, which is what gives you the asynchronicity.

More info about it here: https://mxnet.incubator.apache.org/versions/master/faq/env_var.html#engine-type
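
As a rough sketch, the variable can also be set from inside the script rather than in the shell. This assumes (not confirmed in this thread) that MXNet reads the variable when it is first loaded, so it should be set before mxnet is imported. The helper name use_naive_engine is hypothetical.

```python
import os
import sys


def use_naive_engine():
    """Request MXNet's synchronous engine.

    Returns False if mxnet was already imported, in which case the setting
    may come too late to take effect (assumption: the engine type is read
    when the library is first loaded).
    """
    already_imported = "mxnet" in sys.modules
    os.environ["MXNET_ENGINE_TYPE"] = "NaiveEngine"
    return not already_imported


in_time = use_naive_engine()
print("MXNET_ENGINE_TYPE =", os.environ["MXNET_ENGINE_TYPE"],
      "| set before import:", in_time)

# import mxnet as mx   # import MXNet only after the variable is set
```

With the synchronous engine, the time measured around net(batch_img) should include the whole forward pass, which makes per-line timings easier to interpret. It is meant for debugging only, since it gives up the overlap that the default engine provides.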