I read through the object-detection tutorial on fine-tuning an existing network for object detection, and I also checked this script.
From that I understood that for SSD the images are resized so that the shorter side is 512 or 300, depending on the network.
In the example with the Pikachu dataset the images are square (datashape = 512), so the data loader is configured with square dimensions:
width, height = datashape, datashape
SSDDefaultTrainTransform(width, height, anchors)
I was wondering: if I have rectangular rather than square images, should I modify those lines to match?
For example, if I know my images will be resized so that the shorter side becomes 512, I could apply the same scale factor to compute the expected length of the longer side and use those values for width and height above.
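To make my question concrete, here is a minimal sketch of the arithmetic I have in mind (the helper name is my own, not from GluonCV): scale both sides by the same factor so the shorter side lands on 512, then pass the resulting dimensions as width and height.

```python
def target_size(orig_w, orig_h, short_side=512):
    """Scale (orig_w, orig_h) so the shorter side equals short_side,
    preserving the aspect ratio. Hypothetical helper for illustration."""
    scale = short_side / min(orig_w, orig_h)
    return round(orig_w * scale), round(orig_h * scale)

# e.g. a 1920x1080 image: shorter side 1080 -> 512, longer side scales to 910
width, height = target_size(1920, 1080)  # -> (910, 512)
# then, as in the tutorial:
# SSDDefaultTrainTransform(width, height, anchors)
```

Is passing non-square width/height like this to SSDDefaultTrainTransform the intended way to handle rectangular inputs, or does SSD expect a square input regardless?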