What's the meaning of 0 dimensions in shape and when to use it?


#1

Hi guys,

I noticed that in the current rnn_cell.py, the state_info of RNNCell/LSTMCell/GRUCell returns shapes with a 0 dimension, as shown below:

    def state_info(self):
        return [{'shape': (0, self._num_hidden), '__layout__': 'NC'},
                {'shape': (0, self._num_hidden), '__layout__': 'NC'}]

Could anyone explain what a 0 dimension in a shape means here, and when it should be used? As far as I know, a 0 dimension in a shape implies that the size of the ndarray is 0.

Thanks!


#2

0 does not mean that the size of the ndarray is 0. It means "copy the size from the corresponding dimension of the ndarray you're reshaping".

See http://mxnet.incubator.apache.org/test/versions/0.10/api/python/ndarray.html#mxnet.ndarray.NDArray.reshape
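
To make the reshape rule concrete, here is a minimal pure-Python sketch of how a target shape containing 0 could be resolved. The helper name `resolve_reshape` is hypothetical and is not part of MXNet; it only models the 0 ("copy this dimension from the input") and -1 ("infer from the remaining elements") cases and ignores the other special values that reshape accepts.

```python
from math import prod


def resolve_reshape(input_shape, target_shape):
    """Hypothetical helper: resolve 0 and -1 entries in a reshape target.

    0  -> copy the input dimension at the same position
    -1 -> infer this dimension from the remaining element count
    """
    resolved = []
    for i, dim in enumerate(target_shape):
        if dim == 0:
            if i >= len(input_shape):
                raise ValueError("0 at position %d has no matching input dimension" % i)
            # Copy the size from the corresponding input dimension.
            resolved.append(input_shape[i])
        else:
            resolved.append(dim)

    if resolved.count(-1) > 1:
        raise ValueError("at most one -1 is allowed")

    total = prod(input_shape)
    if -1 in resolved:
        known = prod(d for d in resolved if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        resolved[resolved.index(-1)] = total // known

    if prod(resolved) != total:
        raise ValueError("element count changed by reshape")
    return tuple(resolved)


# The leading 0 keeps the first input dimension; -1 absorbs the rest.
print(resolve_reshape((2, 3, 4), (0, -1)))
# The middle 0 keeps the second input dimension.
print(resolve_reshape((2, 3, 4), (4, 0, 2)))
```

With an input of shape (2, 3, 4), the target (0, -1) resolves to (2, 12) and the target (4, 0, 2) resolves to (4, 3, 2).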


#3

Hi Sad,

Thanks for your kind reply.
For the reshape operation, I agree with you.

But for the state_info() of RNNCell/LSTMCell/GRUCell, the returned state has the shape (0, self._num_hidden), which is then used during shape inference (InferShape).

The reason I ask is that I hit an error at the binding stage when trying to quantize an LSTM/RNN model. Because the shape of begin_state is (0, self._num_hidden), as mentioned above, quantized_fully_connected fails with the error below:

mxnet.base.MXNetError: Error in operator quantized_encoder_birnn_forward_l0_t0_h2h: [09:06:20] src/operator/quantization/quantized_fully_connected.cc:41: Check failed: !shape_is_none(in_shape->at(0)) QuantizedFullyConnectedOp input data shape must be given

Any comments? Thanks!


#4

A zero in a shape means that the size of the corresponding dimension has not yet been inferred. Many layers have parameters whose dimensions depend on the data shape. When you create a network, these dimensions are typically not specified explicitly by the user; instead, they are inferred on the first forward call to the network.
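
As a rough illustration of "0 means not yet inferred", the sketch below fills the unknown dimensions of a declared shape from a concrete one. Both helper names (`is_fully_known`, `fill_unknown_dims`) and the values 128 and 32 are made up for this example; this is not MXNet's actual shape-inference code, only a model of the idea.

```python
def is_fully_known(shape):
    """Hypothetical check: a shape is usable only if no dimension is 0."""
    return len(shape) > 0 and all(dim > 0 for dim in shape)


def fill_unknown_dims(declared, observed):
    """Hypothetical merge: replace each 0 in `declared` with the size from `observed`."""
    if len(declared) != len(observed):
        raise ValueError("rank mismatch: %r vs %r" % (declared, observed))
    merged = []
    for d, o in zip(declared, observed):
        if d == 0:
            merged.append(o)   # unknown so far: take the observed size
        elif o == 0 or d == o:
            merged.append(d)   # already known and consistent
        else:
            raise ValueError("incompatible dimensions: %d vs %d" % (d, o))
    return tuple(merged)


num_hidden = 128                 # assumed hidden size for the example
state_shape = (0, num_hidden)    # batch dimension not known yet
print(is_fully_known(state_shape))

# Once a batch of 32 samples is seen, the batch dimension can be filled in.
state_shape = fill_unknown_dims(state_shape, (32, num_hidden))
print(state_shape, is_fully_known(state_shape))
```

An operator that requires a fully known input shape would reject (0, 128) but accept the shape once the batch dimension has been filled in.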


#5

Thanks for the info, safrooze.
When you mention "the first forward call to the network", do you mean that these 0-valued dimensions are determined when running forward with the first batch of data, or from some other input information during the binding stage?


#6

Yes, the dimensions are inferred on the first forward call using the first batch of data.
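
A toy sketch of that deferred behaviour, in plain Python rather than MXNet: the class `LazyDense` below is hypothetical and only tracks shapes and zero-valued weights, but it shows a weight shape that starts with a 0 dimension and is fixed by the first batch passed to `forward`.

```python
class LazyDense:
    """Toy dense layer whose input width is unknown until the first forward call."""

    def __init__(self, units):
        self.units = units
        self.weight_shape = (units, 0)   # 0 = input width not inferred yet
        self.weight = None

    def forward(self, batch):
        in_units = len(batch[0])         # width of the first sample in the batch
        if self.weight_shape[1] == 0:
            # First call: infer the missing dimension and allocate the weights.
            self.weight_shape = (self.units, in_units)
            self.weight = [[0.0] * in_units for _ in range(self.units)]
        elif self.weight_shape[1] != in_units:
            raise ValueError("input width changed after shape inference")
        # Plain matrix product: one output row per sample.
        return [[sum(w * x for w, x in zip(row, sample)) for row in self.weight]
                for sample in batch]


layer = LazyDense(units=4)
print(layer.weight_shape)                # still contains a 0
out = layer.forward([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(layer.weight_shape)                # fixed by the first batch
```

After the first call the shape stays fixed, so a later batch with a different width raises an error instead of silently re-inferring it.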


#7

It's clear now, thanks for the helpful information :)