Why is the `mAP` of the pretrained model `ssd_512_mobilenet1.0_voc` only 0.302?

Hi,
I downloaded the pretrained model ssd_512_mobilenet1.0_voc from model_zoo and tested it on the images listed in VOC2007 test.txt. However, the final mAP on these images was only 0.302 (about 30.2%), whereas this model is reported to reach 75.4 mAP on this page: Detection Model - GluonCV. Below is the main code I used to compute the mAP.

    # Resize the short side to 512, keeping the aspect ratio.
    x, image = gcv.data.transforms.presets.ssd.load_test(img_file, 512)
    # Ratio of the longer side after resizing to the longer side of img
    # (img is not defined in this excerpt).
    scale = np.max(x.shape[2:]) / np.max(img.shape[:2])
    mod.predict(x)
    cls, score, bbox = mod.get_outputs()

    # Parse the ground truth; scale is passed to the helper parse_xml_get_labels (not shown).
    gt_bboxes, gt_labels = parse_xml_get_labels(ann_file, scale)
    gt_bboxes = nd.array([gt_bboxes])
    gt_labels = nd.array([gt_labels])

    eval_map.update(bbox, cls, score, gt_bboxes, gt_labels)

where eval_map is an instance of VOCMApMetric, defined as follows:

eval_map = metrics.voc_detection.VOCMApMetric(class_names=classes)
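
As a sanity check on the snippet above, the predicted boxes and the ground-truth boxes must be in the same coordinate frame before they reach the metric. The following is a minimal sketch in plain Python of scaling annotations by a short-side resize factor. The helper names `short_side_scale` and `rescale_boxes` and the example image size and box are hypothetical, and the sketch ignores any rounding or maximum-size cap that the real preset may apply.

```python
def short_side_scale(height, width, short=512):
    """Factor that maps the shorter image side to `short` pixels."""
    return short / min(height, width)


def rescale_boxes(boxes, scale):
    """Scale [xmin, ymin, xmax, ymax] boxes by one uniform factor."""
    return [[coord * scale for coord in box] for box in boxes]


# Hypothetical image of height 375 and width 500 with one annotated box.
scale = short_side_scale(375, 500)
scaled = rescale_boxes([[48, 240, 195, 371]], scale)
print(scale, scaled)
```

If the scale applied to the annotations differs from the one applied to the image, the IoU between predictions and ground truth drops for every class at once.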

Any advice? Thanks in advance.

Update: here is the AP for each class:
aeroplane     0.20737636609519228
bicycle       0.4436244683931908
bird          0.17632512042747953
boat          0.15485441267009287
bottle        0.17678841525254013
bus           0.358982424041574
car           0.49986873675690324
cat           0.30666527055099424
chair         0.26257164243189357
cow           0.33899464516695377
diningtable   0.13646118619561104
dog           0.306361505033263
horse         0.32900757094999067
motorbike     0.4029432468376276
person        0.5798341983795638
pottedplant   0.15452015604063923
sheep         0.23183657997975202
sofa          0.20495553684733925
train         0.3032973551220125
tvmonitor     0.460550103376441
mAP           0.30179094702745274
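
The reported mAP can be cross-checked against the per-class values: it should be their unweighted mean over the 20 VOC classes. A short sketch in plain Python, with the numbers copied from the update above:

```python
# Per-class AP values copied from the update above, in VOC class order.
per_class_ap = [
    0.20737636609519228, 0.4436244683931908, 0.17632512042747953,
    0.15485441267009287, 0.17678841525254013, 0.358982424041574,
    0.49986873675690324, 0.30666527055099424, 0.26257164243189357,
    0.33899464516695377, 0.13646118619561104, 0.306361505033263,
    0.32900757094999067, 0.4029432468376276, 0.5798341983795638,
    0.15452015604063923, 0.23183657997975202, 0.20495553684733925,
    0.3032973551220125, 0.460550103376441,
]

# mAP is the plain average of the per-class APs.
mean_ap = sum(per_class_ap) / len(per_class_ap)
print(f"{len(per_class_ap)} classes, mAP = {mean_ap:.4f}")
```

The mean agrees with the reported mAP up to floating-point rounding, so the low score reflects uniformly low per-class APs rather than an aggregation error.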

How did you create your network? It’s possible that the reported mAP was computed on a larger test set, or with different nms_thresh/nms_topk settings.
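
To illustrate why nms_thresh and nms_topk can change the detections that reach the metric, here is a minimal greedy NMS sketch in plain Python. It is an illustration only, not GluonCV's implementation; the function names, the default values, and the example boxes are assumptions made for this sketch.

```python
def iou(a, b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, nms_thresh=0.45, nms_topk=400):
    """Greedy NMS: keep the top-k scoring boxes, then drop any box whose
    IoU with an already kept box exceeds nms_thresh. Returns kept indices."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order[:nms_topk]:
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two heavily overlapping boxes and one separate box.
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]

kept_strict = nms(boxes, scores, nms_thresh=0.45)  # overlapping box is suppressed
kept_loose = nms(boxes, scores, nms_thresh=0.9)    # overlapping box survives
kept_top1 = nms(boxes, scores, nms_topk=1)         # only the best box is considered
```

A looser threshold lets more duplicate detections through, which count as false positives in the AP computation, while a small top-k can discard true positives before suppression even runs.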

Please post a minimal reproducible example.