Multiscale Object Detection

http://en.diveintodeeplearning.org/chapter_computer-vision/multiscale-object-detection.html

Hello,

in display_anchors, should it not be:

fmap = nd.zeros((1, 10, fmap_h, fmap_w))

so fmap_w and fmap_h are flipped?

This confused me quite a bit. Images and feature maps are encoded as (n, c, h, w), i.e. height before width, meaning you index them spatially as (y, x).
But the boxes are indexed (x, y), i.e. (x, y, w, h).

This means the input to MultiBoxPrior is (h, w), but its output is (w, h).
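To make the mismatch concrete, here is a minimal NumPy sketch of the two conventions side by side; the box values are hypothetical and just illustrate the ordering:

```python
import numpy as np

# Feature maps use the NCHW layout: (batch, channels, height, width),
# so spatial indexing is fmap[n, c, y, x] -- row (y) before column (x).
fmap_h, fmap_w = 2, 4
fmap = np.zeros((1, 10, fmap_h, fmap_w))
assert fmap.shape[2] == fmap_h and fmap.shape[3] == fmap_w

# Anchor boxes, by contrast, are stored point-style:
# (x_min, y_min, x_max, y_max) -- x (the width axis) comes first.
box = (0.25, 0.10, 0.75, 0.90)  # hypothetical normalized box
x_min, y_min, x_max, y_max = box
width, height = x_max - x_min, y_max - y_min
```

So the same spatial pair flips order depending on whether you are indexing a tensor or reading a box.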

It may be worth commenting on this flip between height and width somewhere; it is confusing.

@mseeger I agree it’s confusing to move between the NDArray tensor convention (h, w) and the point-style convention (x, y).

Looking at the MultiBoxPrior operator definition, it appears to use the convention of width first, then height. That is indeed non-standard: when coordinates are described as height and width, they usually come in height, width order.
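In practice you end up flipping between the two orders whenever you paint a point-style box onto a feature map. A small sketch of that conversion, with a hypothetical helper name and normalized coordinates:

```python
import numpy as np

# Hypothetical helper: map a normalized corner-style box
# (x_min, y_min, x_max, y_max) onto pixel slices of an (h, w) array,
# flipping into the (row=y, col=x) order that NDArray/NumPy indexing uses.
def box_to_slices(box, h, w):
    x_min, y_min, x_max, y_max = box
    rows = slice(int(y_min * h), int(np.ceil(y_max * h)))  # y -> rows
    cols = slice(int(x_min * w), int(np.ceil(x_max * w)))  # x -> cols
    return rows, cols

fmap = np.zeros((4, 6))  # (h, w) spatial grid
rows, cols = box_to_slices((0.0, 0.5, 0.5, 1.0), 4, 6)
fmap[rows, cols] = 1.0  # bottom-left quadrant of the map
```

Note how x and y swap places exactly once, inside the helper; keeping that swap in one spot is the easiest way to avoid the bug discussed above.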