The targets are faces, so I can't say they are big, but the bounding boxes in FDDB (the test set) are much larger than those in the training set (data from Kaggle; I adjusted the bounding boxes). The FDDB boxes enclose the whole head, whereas the training-set boxes enclose only the facial features (eyes, nose, mouth).
You could try plotting the predicted bounding boxes and comparing them to the ground truth.
I tried that and the results look fine. Accuracy is close to perfect, and I did not find any misclassified examples.
However, a few faces are not detected at all.
You could also try increasing the IoU threshold (e.g. to 0.99) and see whether mAP drops below 1.0. The results might provide some clues.
I tried that too. If I increase the IoU threshold to 0.6 (or maybe it was 0.7), mAP becomes zero. This is probably because the training-set bounding boxes enclose only the facial features, while the test-set boxes enclose the whole head.
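To illustrate why the threshold matters so much here, this is a minimal sketch of the IoU calculation with made-up coordinates (the `head_box` and `face_box` values are hypothetical, not taken from FDDB or the Kaggle data). It assumes boxes in `(x_min, y_min, x_max, y_max)` format. A tight box that sits entirely inside a larger head box can never score higher than the ratio of the two areas, so every detection is rejected once the threshold passes that ratio.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical example: ground truth encloses the head,
# the prediction encloses only the facial features.
head_box = (0, 0, 100, 120)
face_box = (10, 20, 90, 105)

score = iou(head_box, face_box)
for threshold in (0.5, 0.6, 0.7):
    print(threshold, "match" if score >= threshold else "miss")
```

With these example numbers the IoU is about 0.57, so the prediction counts as a match at a threshold of 0.5 but as a miss at 0.6 and above, even though the face is located correctly.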
There are 2845 images for testing, which may contain more than 6000 faces (I haven't counted). I think this is big enough for testing (the training set has only 406 images).
I wonder about that too.