Why is the gluoncv yolo3_darknet53_coco result size 100?

I’m using yolo3_darknet53_coco from the gluoncv model zoo, and the output for a batch of 1 has size 100: one true class, with all the other placeholder slots padded with “-1”. Why is that?

Did you change the default settings for post_nms?
https://gluon-cv.mxnet.io/_modules/gluoncv/model_zoo/yolo/yolo3.html

If you change post_nms to -1, you should be good.


To be more specific, you can set it using:
net.set_nms(post_nms=-1)

Thanks for following up!
What I meant is: why is the output size 100?

Got it, I see what you are referring to in the docs:

"post_nms : int, default is 100
        Only return top `post_nms` detection results, the rest is discarded. The number is
        based on COCO dataset which has maximum 100 objects per image. You can adjust this
        number if expecting more objects. You can use -1 to return all detections."
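To make the quoted behaviour concrete, here is a minimal NumPy sketch of "keep the top `post_nms` results and fill the empty slots with -1". It is only an illustration of the idea, not gluoncv's actual implementation; `keep_top_detections` is a hypothetical helper name, and the assumption that empty slots are filled with -1 comes from the output described in the question.

```python
import numpy as np

def keep_top_detections(scores, post_nms=100):
    """Hypothetical helper: fixed-length scores, padded with -1.

    Assumption: detections are ranked by score, only the best
    `post_nms` are kept, and unused slots are filled with -1.
    """
    ranked = np.sort(scores)[::-1]        # highest score first
    if post_nms < 0:                      # -1 means "return everything"
        return ranked
    out = np.full(post_nms, -1.0)         # placeholder slots
    top = ranked[:post_nms]
    out[:top.size] = top
    return out

# Three real detections still produce a length-100 result.
fixed = keep_top_detections(np.array([0.3, 0.98, 0.7]))
print(fixed.shape)  # (100,)
```

With only a few real detections the remaining slots stay at -1, which matches the padded output described above.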
prediction = network(data)

# This time the prediction isn't a single MXNet ndarray; it's a tuple.
# With detection models, you can expect three MXNet ndarrays to be returned.
# We can loop through the tuple and print out the shape of each array:
# for index, array in enumerate(prediction):
#     print('#{} shape{}'.format(index + 1, array.shape))

# #1 shape(1, 100, 1) -- The first array contains the object class indices.
# #2 shape(1, 100, 1) -- The second array contains the object class probabilities.
# #3 shape(1, 100, 4) -- The last array contains the object bounding box coordinates.
#
# Notice that the shape of each array starts with 1, 100. Why is this?
# We gave the network a batch of one image, so we get back a batch of one prediction,
# and the model can predict up to 100 objects in a single image.
#
# So for the first array, with shape (1, 100, 1), we have 1 image, 100 potential objects,
# and 1 class index per object.
#
# For the last array, with shape (1, 100, 4), we have 1 image, 100 potential objects,
# and 4 values per object that define its bounding box.

# Unpack the prediction (the batch dimension is kept here).
class_ids, scores, bounding_boxes = prediction

# Alternative: since we're only performing object detection on one image,
# remove the batch dimension from all of the arrays before unpacking.
# prediction = [array[0] for array in prediction]
# class_indices, probabilities, bounding_boxes = prediction

# print('class_indices:', class_indices[0:10])
# print('probabilities:', probabilities[0:10])
# print('bounding_boxes:', bounding_boxes[0:10])
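If you only want the real detections, one option is to drop the padded rows after the forward pass. The sketch below uses NumPy stand-in arrays with the same shapes as above, so it runs without a model; the values (class 16, score 0.97, the box coordinates) are made up for illustration, and `valid_detections` is a hypothetical helper name. It assumes padded rows have a class index of -1, as in the output described in the question.

```python
import numpy as np

# Stand-in arrays shaped like the detector output: 1 image, 100 slots.
class_ids = np.full((1, 100, 1), -1.0)
scores = np.full((1, 100, 1), -1.0)
bounding_boxes = np.full((1, 100, 4), -1.0)

# One made-up detection in the first slot.
class_ids[0, 0, 0] = 16.0
scores[0, 0, 0] = 0.97
bounding_boxes[0, 0] = [50.0, 60.0, 200.0, 220.0]

def valid_detections(class_ids, scores, bounding_boxes):
    """Hypothetical helper: drop the batch dimension and the -1 padding."""
    ids, probs, boxes = class_ids[0], scores[0], bounding_boxes[0]
    keep = ids[:, 0] >= 0                 # padded rows have class index -1
    return ids[keep, 0].astype(int), probs[keep, 0], boxes[keep]

ids, probs, boxes = valid_detections(class_ids, scores, bounding_boxes)
print(ids.shape, probs.shape, boxes.shape)  # (1,) (1,) (1, 4)
```

With the real network output, convert each MXNet ndarray first with `.asnumpy()` before passing it to the helper.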