I am using the Gluon package to run a COCO-trained Faster RCNN model (Link here; the tutorial is for SSD, but I’m using an FRCNN model).
The issue I’m having is that converting the output arrays (bounding_boxes, scores and class_IDs) to NumPy takes a very long time. This is because the mxnet engine is asynchronous, so the system has to finish all pending computations before it can convert an array. This delay is killing me: we have thousands of images to run detections on, and each one takes about 1-2 seconds.
The Faster RCNN output is an 80000-element array, although I only need the first 10 elements. Slicing the array down to the first 10 and calling .asnumpy() still takes a long time, though, because it is still computing the other elements…
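To check where the 1-2 seconds actually go, it may help to time the forward call, the wait for the engine, and the copy to NumPy separately. Below is a minimal sketch; `time_stages` and its three callback parameters are hypothetical names made up for illustration, not part of mxnet or GluonCV. The mxnet calls mentioned in the comments (`mx.nd.waitall()` and `.asnumpy()`) are real, but `net` and `x` stand in for your own model and input batch.

```python
import time


def time_stages(forward, sync, convert):
    """Time three stages separately: enqueue, compute, and copy.

    forward: callable that launches the work and returns the outputs
    sync:    callable that blocks until the outputs are computed
    convert: callable that turns the outputs into host-side data
    """
    t0 = time.perf_counter()
    out = forward()          # asynchronous engines return almost immediately
    t1 = time.perf_counter()
    sync(out)                # the real computation is paid for here
    t2 = time.perf_counter()
    result = convert(out)    # after syncing, only the copy remains
    t3 = time.perf_counter()
    timings = {"enqueue": t1 - t0, "compute": t2 - t1, "copy": t3 - t2}
    return result, timings


# Usage with mxnet (assumes `net` and `x` already exist, and that each
# output has the batch on axis 0 and the detections on axis 1):
#   result, timings = time_stages(
#       forward=lambda: net(x),
#       sync=lambda out: mx.nd.waitall(),
#       convert=lambda out: [a[0, :10].asnumpy() for a in out],
#   )
#   print(timings)
```

If "compute" dominates and "copy" is tiny, then `.asnumpy()` itself is probably not the slow part; it is just the first call that has to wait for the network to finish.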
I can only think of three solutions to speed up the code:
- Make the initial output of the FRCNN network shorter, e.g. a max output array length of 10
- Use environment variables to change the engine to a synchronous engine (under “Engine Type”)
- Somehow find a way to make asnumpy() faster
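For reference, the first two ideas could be tried roughly as follows. This is only a sketch under assumptions: `load_detector` is a hypothetical helper, the model name `faster_rcnn_resnet50_v1b_coco` is a guess at which model zoo entry is in use, and I am assuming the Faster RCNN model exposes `set_nms` with a `post_nms` argument the way other GluonCV detectors do. `MXNET_ENGINE_TYPE=NaiveEngine` is the documented way to select the synchronous engine; my understanding is that it has to be set before mxnet is first imported.

```python
import os

# Select the synchronous engine. Assumption: mxnet reads this variable
# once at import time, so set it before the first `import mxnet`.
os.environ["MXNET_ENGINE_TYPE"] = "NaiveEngine"


def load_detector(model_name="faster_rcnn_resnet50_v1b_coco", max_detections=10):
    """Hypothetical helper: load a detector and cap its output length.

    The model name is an assumption; substitute the one you actually use.
    """
    # Imported lazily so the environment variable above is already set.
    from gluoncv import model_zoo

    net = model_zoo.get_model(model_name, pretrained=True)
    # Assumption: post_nms limits how many detections are returned after
    # non-maximum suppression, shortening the output arrays.
    net.set_nms(nms_thresh=0.3, nms_topk=400, post_nms=max_detections)
    return net
```

Note that a synchronous engine would likely only move the wait from `.asnumpy()` into the forward call rather than reduce the total time per image, and capping the output length may trim only the post-processing, not the network's own computation.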
Can anyone help?