Non-square input image into ssd

kargarisaac · February 19, 2019, 12:04pm

Hello,
I want to feed an image into SSD. The image is non-square, and I want to make it square and feed it into SSD using gcv.data.transforms.presets.ssd.load_test(), but that function only takes the short-side size. How can I make the image square without using the imresize function from mxnet.image? load_test() also takes a file path and loads the image itself. Is there another function that accepts a pre-loaded image? I have my image as a NumPy array, but I want to use this function for the normalization and whatever else it does.

ThomasDelteil · February 19, 2019, 6:44pm

@kargarisaac, SSD should support rectangular images by default as well. Are you sure you want to crop the image to a square?

You can use transform_test, which accepts an already-loaded image, to preprocess your image before passing it to your model:

import mxnet as mx
import gluoncv as gcv
import cv2

MIN_SIZE=300
MAX_SIZE=500

mx.test_utils.download('https://helpx.adobe.com/in/stock/how-to/visual-reverse-image-search/_jcr_content/main-pars/image.img.jpg/visual-reverse-image-search-v2_1000x560.jpg', 'test.jpg')
img = cv2.imread('test.jpg')[:,:,::-1]
print('Original',img.shape)
img_tf, img_np = gcv.data.transforms.presets.ssd.transform_test(mx.nd.array(img), short=MIN_SIZE, max_size=MAX_SIZE)
print('NDArray for MXNet', img_tf.shape)
print('Numpy image', img_np.shape)

Original (560, 1000, 3)
NDArray for MXNet (1, 3, 280, 500)
Numpy image (280, 500, 3)
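
To see where the (280, 500) shape comes from, here is a small arithmetic sketch. It assumes the preset first scales the short side to `short` and, if the long side would then exceed `max_size`, rescales so the long side equals `max_size`. The helper name short_within_shape is hypothetical and is not part of GluonCV.

```python
def short_within_shape(h, w, short, max_size):
    """Hypothetical helper: output (height, width) after an
    aspect-preserving resize with a short-side target and a long-side cap."""
    # Scale so that the short side becomes `short`.
    scale = short / min(h, w)
    # If the long side would overshoot, scale by the long side instead.
    if round(scale * max(h, w)) > max_size:
        scale = max_size / max(h, w)
    return int(round(h * scale)), int(round(w * scale))


print(short_within_shape(560, 1000, 300, 500))  # (280, 500)
```

For the 560x1000 test image, scaling the short side to 300 would make the long side about 536, so the cap of 500 wins and the short side ends up at 280 rather than 300.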

However, if you really want a square image, I would suggest simply writing your own transform:

SIZE=300

transform = mx.gluon.data.vision.transforms.Compose([
    mx.gluon.data.vision.transforms.Resize(size=SIZE, keep_ratio=True),
    mx.gluon.data.vision.transforms.CenterCrop(SIZE),
    mx.gluon.data.vision.transforms.ToTensor(),
    mx.gluon.data.vision.transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
])

mx.test_utils.download('https://helpx.adobe.com/in/stock/how-to/visual-reverse-image-search/_jcr_content/main-pars/image.img.jpg/visual-reverse-image-search-v2_1000x560.jpg', 'test.jpg')
img = cv2.imread('test.jpg')[:,:,::-1]
img_tf = transform(mx.nd.array(img)).expand_dims(axis=0)

print('Original', img.shape)
print('Transformed', img_tf.shape)

Original (560, 1000, 3)
Transformed (1, 3, 300, 300)

However, this calls mx.image under the hood. If you really don't want to use the OpenCV-related functions, you can use the mx.nd.contrib.BilinearResize2D operator, but you would need to compute the correct width and height yourself in order to keep the aspect ratio. You would also need to do the center cropping yourself, but it's all pretty simple using the information in img.shape.

You can define a function like this:

def resize_and_crop(x):
    ...

and then add it to the pipeline like this:

transform = mx.gluon.data.vision.transforms.Compose([
    mx.gluon.data.vision.transforms.ToTensor(),
    mx.gluon.data.vision.transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
    # ToTensor outputs CHW, so add a batch axis to get the NCHW layout before resizing
    mx.gluon.nn.Lambda(lambda x: resize_and_crop(x.expand_dims(axis=0)))
])
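
The body of resize_and_crop is left elided above. As a rough sketch of the geometry it would need, the hypothetical helper below (center_square_geometry, not an MXNet function) computes the aspect-preserving resize target and the center-crop offsets from the image height and width alone. The assumption is that the resulting height and width would be passed to the resize operator and the offsets used to slice out the square.

```python
def center_square_geometry(h, w, size):
    """Hypothetical helper: for an image of shape (h, w), return
    (new_h, new_w, top, left) so that resizing to (new_h, new_w) keeps the
    aspect ratio with the short side equal to `size`, and the crop
    [top:top+size, left:left+size] is the centered square."""
    short = min(h, w)
    # Scale both sides by size / short; never go below `size` after rounding.
    new_h = max(size, int(round(h * size / short)))
    new_w = max(size, int(round(w * size / short)))
    # Offsets that center a size x size window in the resized image.
    top = (new_h - size) // 2
    left = (new_w - size) // 2
    return new_h, new_w, top, left


print(center_square_geometry(560, 1000, 300))  # (300, 536, 0, 118)
```

For the 560x1000 test image and a target of 300, this gives a resize to 300x536 followed by a crop starting at column 118, which matches the (300, 300) spatial size produced by the Resize plus CenterCrop pipeline earlier in this post.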

kargarisaac · February 20, 2019, 5:54am

@ThomasDelteil
Thank you for your answer.
When we train our model with 300x300 input images, can we feed images of a different size into it? I thought this might decrease the performance, but I'm not sure.

ThomasDelteil · February 21, 2019, 8:01pm

Two things to consider:

Within a batch, all images must have the same size so that they can be stacked and laid out contiguously in memory.
If you use different sizes, MXNet runs a convolution optimization algorithm (cuDNN autotune) for every new shape it encounters. Consider disabling it by setting MXNET_CUDNN_AUTOTUNE_DEFAULT=0 and check what performance you get.
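
As a minimal sketch, the variable can be set from Python instead of the shell. Setting it before MXNet is imported is an assumption made here to be safe, so that the value is in place before any convolution runs.

```python
import os

# Disable cuDNN autotuning so that a new input shape does not trigger
# a fresh convolution algorithm search.
# Assumption: set this before `import mxnet` so it is visible from the start.
os.environ["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"

# import mxnet as mx  # import MXNet only after the variable is set
```

The shell equivalent is to export MXNET_CUDNN_AUTOTUNE_DEFAULT=0 before launching the script.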
