I want to feed an image into SSD. The image is non-square, and I want to make it square before feeding it into SSD using gcv.data.transforms.presets.ssd.load_test(), but that function only resizes the short side. How can I make the image square without using the imresize function?
Also, load_test() takes a file path and loads the image itself. Is there another command that accepts a pre-loaded image? I have my image as a NumPy array, but I want to use this function to do the normalization and whatever else it does.
@kargarisaac, SSD supports rectangular images by default. Are you sure you want to crop the image to a square?
You can use this transform to transform your image and put it in your model:
```python
import mxnet as mx
import gluoncv as gcv
import cv2

MIN_SIZE = 300
MAX_SIZE = 500

mx.test_utils.download('https://helpx.adobe.com/in/stock/how-to/visual-reverse-image-search/_jcr_content/main-pars/image.img.jpg/visual-reverse-image-search-v2_1000x560.jpg', 'test.jpg')
img = cv2.imread('test.jpg')[:, :, ::-1]  # BGR -> RGB
print('Original', img.shape)

img_tf, img_np = gcv.data.transforms.presets.ssd.transform_test(
    mx.nd.array(img), short=MIN_SIZE, max_size=MAX_SIZE)
print('NDArray for MXNet', img_tf.shape)
print('Numpy image', img_np.shape)
```
```
Original (560, 1000, 3)
NDArray for MXNet (1, 3, 280, 500)
Numpy image (280, 500, 3)
```
However, if you really want a square image, I would suggest simply writing your own transform:
```python
SIZE = 300
transform = mx.gluon.data.vision.transforms.Compose([
    mx.gluon.data.vision.transforms.Resize(size=SIZE, keep_ratio=True),
    mx.gluon.data.vision.transforms.CenterCrop(SIZE),
    mx.gluon.data.vision.transforms.ToTensor(),
    mx.gluon.data.vision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                              std=(0.229, 0.224, 0.225))
])

mx.test_utils.download('https://helpx.adobe.com/in/stock/how-to/visual-reverse-image-search/_jcr_content/main-pars/image.img.jpg/visual-reverse-image-search-v2_1000x560.jpg', 'test.jpg')
img = cv2.imread('test.jpg')[:, :, ::-1]  # BGR -> RGB
img_tf = transform(mx.nd.array(img)).expand_dims(axis=0)
print('Original', img.shape)
print('Transformed', img_tf.shape)
```
```
Original (560, 1000, 3)
Transformed (1, 3, 300, 300)
```
However, this calls mx.image under the hood. If you really don't want to use the OpenCV-related functions, you can use the mx.nd.contrib.BilinearResize2D operator, but you would need to compute the correct width and height yourself in order to keep the aspect ratio, and do the center cropping yourself as well. It's all pretty simple using the NDArray API: you can define a function like this
```python
def resize_and_crop(x):
    ...
```
and then add it to the pipeline like this:
```python
transform = mx.gluon.data.vision.transforms.Compose([
    mx.gluon.data.vision.transforms.ToTensor(),
    mx.gluon.data.vision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                              std=(0.229, 0.224, 0.225)),
    mx.gluon.nn.Lambda(lambda x: resize_and_crop(x.expand_dims(axis=0)))
])
```
Thank you for your answer.
When we train our model with 300×300 input images, can we feed an image of a different size into it? I thought maybe this decreases the performance. I'm not sure.
Two things to consider:
- Within a batch, all images must have the same size, so that they can be stacked and laid out contiguously in memory
- If you use different input sizes, cuDNN runs a convolution auto-tuning algorithm for every new shape it encounters. Consider disabling it with MXNET_CUDNN_AUTOTUNE_DEFAULT=0 and check what performance you get.