Slow to retrieve data after inference

lucas · August 8, 2019, 4:06pm

Hello,

I’m using GluonCV for a detection task and I have run into something that looks really weird to me:
Inference is very quick, but the first time I access the result of the inference it takes ages, and how long depends on the model used.

For example:
With SSD:
‘‘cid, score, bbox = net(batch_img)’’ takes 0.013 sec
‘‘cid = cid.asnumpy()’’ takes 0.2 sec

With YOLO:
‘‘cid, score, bbox = net(batch_img)’’ takes 0.014 sec
‘‘cid = cid.asnumpy()’’ takes 1.5 sec

The two lines of code are called consecutively. In fact, I can change the second line: whether I print cid, score, or bbox or do any other operation with them, the first access is always very slow, and the following ones are quick again.

So if I do:

‘‘cid, score, bbox = net(batch_img)
print(cid)
cid = cid.asnumpy()’’

the first line takes 0.014 sec, the second 1.5 sec, and the third 0.00004 sec.

Can someone explain why this is happening and how to speed things up? I really want to use YOLO as the detector, since its bounding boxes are far more relevant than SSD’s, but this makes it far too slow.

Thank you for your help

lgo · August 8, 2019, 8:03pm

Hi Lucas,

The calls to net() are actually asynchronous, so they return immediately. It’s only when you try to access the result of net(), either by converting the MXNet NDArray to a NumPy array or by printing cid, that the access blocks until the inference has finished.
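To make the timing visible, here is a minimal sketch of a helper that separates the time to queue the work from the time until the results are actually ready. The helper name timed_forward is made up for illustration; it only assumes that net returns an iterable of arrays with a wait_to_read() method, as MXNet NDArrays have.

```python
import time


def timed_forward(net, batch):
    """Run net(batch) and time the launch and the full computation separately.

    Hypothetical helper: assumes net(batch) returns an iterable of
    objects exposing wait_to_read(), like MXNet NDArrays.
    Returns (outputs, launch_seconds, total_seconds).
    """
    start = time.perf_counter()
    outputs = net(batch)           # with an async engine this only queues the work
    launched = time.perf_counter()
    for out in outputs:
        out.wait_to_read()         # blocks until this output has been computed
    done = time.perf_counter()
    return outputs, launched - start, done - start
```

With the detector from the question, this would be called as outputs, launch, total = timed_forward(net, batch_img). The launch value corresponds to the 0.014 sec measured above, while total is the real inference time that otherwise shows up in the first asnumpy() or print call.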

If your inference is slow, it is because those calculations take time. Try using a GPU, using fewer classes, making the input image smaller, or using a model based on a smaller base network (e.g. MobileNet).

hth,

Lieven

lucas · August 9, 2019, 9:04am

Hi Lieven,

Thank you very much for your quick answer. I understand better now. I’ve tried with a single GPU as you suggested, and my speed problem is gone.

Thanks again and I apologize for the noob question,

Lucas

spanev · August 9, 2019, 3:21pm

Also, for debugging purposes, you can run MXNet with the environment variable MXNET_ENGINE_TYPE=NaiveEngine, which makes MXNet use a synchronous engine instead of the default ThreadedEnginePerDevice, which is what gives you the asynchronicity.

More info about it here https://mxnet.incubator.apache.org/versions/master/faq/env_var.html#engine-type
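If you prefer to set the variable from inside the script rather than in the shell, a minimal sketch could look like this. The function name use_naive_engine is made up for illustration, and the sketch assumes the variable has to be set before mxnet is imported in order to take effect, so it refuses to run otherwise.

```python
import os
import sys


def use_naive_engine():
    """Select MXNet's synchronous engine for this process (debugging only).

    Hypothetical helper: assumes MXNET_ENGINE_TYPE must be set before
    mxnet is imported, so it raises if mxnet is already loaded.
    """
    if "mxnet" in sys.modules:
        raise RuntimeError("set MXNET_ENGINE_TYPE before importing mxnet")
    os.environ["MXNET_ENGINE_TYPE"] = "NaiveEngine"


use_naive_engine()
# Import mxnet / gluoncv only after this point.
```

With the synchronous engine, the time should show up on the net(batch_img) line itself instead of on the first access to the result, which makes per-line timings easier to read.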
