What's the meaning of 0 dimensions in shape and when to use it?


Hi guys,

I found that in the current rnn_cell.py, state_info for RNNCell/LSTMCell/GRUCell returns shapes with a 0 dimension, as shown below:

    def state_info(self):
        return [{'shape': (0, self._num_hidden), '__layout__': 'NC'},
                {'shape': (0, self._num_hidden), '__layout__': 'NC'}]

Could anyone tell me what a 0 dimension in a shape means here, and when it should be used? As far as I know, a 0 dimension in a shape implies that the size of the ndarray is 0.



0 does not mean that the size of the ndarray is 0. In reshape, it means: copy the size from the corresponding dimension of the ndarray you're reshaping.

See http://mxnet.incubator.apache.org/test/versions/0.10/api/python/ndarray.html#mxnet.ndarray.NDArray.reshape
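
To make the "copy over" rule concrete, here is a minimal pure-Python sketch of how a 0 in a target shape can be resolved. The helper name reshape_with_zero is made up for illustration. It is not MXNet's implementation, and it only models the 0 special value (not -1 or the other special values that reshape accepts).

```python
from math import prod


def reshape_with_zero(input_shape, target_shape):
    """Resolve a target shape where 0 means 'copy the input dimension at this axis'.

    Hypothetical helper for illustration only; it is not part of MXNet.
    """
    resolved = []
    for axis, dim in enumerate(target_shape):
        if dim == 0:
            # 0 is a placeholder: reuse the size of the same axis of the input.
            resolved.append(input_shape[axis])
        else:
            resolved.append(dim)
    resolved = tuple(resolved)
    # A reshape must preserve the total number of elements.
    if prod(resolved) != prod(input_shape):
        raise ValueError(f"cannot reshape {input_shape} into {resolved}")
    return resolved


print(reshape_with_zero((2, 3, 4), (4, 0, 2)))  # (4, 3, 2)
print(reshape_with_zero((2, 3, 4), (2, 0, 0)))  # (2, 3, 4)
```

Both calls mirror the examples given in the reshape documentation linked above: the 0 entries never produce an empty array, they just reuse the input's dimension.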


Hi Sad,

Thanks for your kind reply.
For the reshape operation, I agree with you.

But for the state_info() of RNNCell/LSTMCell/GRUCell, the returned state has the shape (0, self._num_hidden), which is used during shape inference (InferShape).

The reason I ask is that I hit an error during the binding stage when trying to quantize an LSTM/RNN model. Since the shape of begin_state is (0, self._num_hidden), as mentioned above, quantized_fully_connected fails with the error below:

mxnet.base.MXNetError: Error in operator quantized_encoder_birnn_forward_l0_t0_h2h: [09:06:20] src/operator/quantization/quantized_fully_connected.cc:41: Check failed: !shape_is_none(in_shape->at(0)) QuantizedFullyConnectedOp input data shape must be given

Any comments? Thanks!


A zero in a shape means that the corresponding dimension size has not yet been inferred. Many layers have parameters whose dimensions depend on the data shape. When you create a network, these dimensions are typically not specified explicitly by the user; instead, they are inferred on the first forward call to the network.


Thanks for your info, safrooze.
When you mention “the first forward call to the network”, do you mean these 0-valued dimensions are determined when running forward with the first batch of data, or from some other input information during the binding stage?


Yes, the dimensions are inferred on the first forward call using the first batch of data.
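
As a rough illustration of that idea, the sketch below fills in the unknown (0) batch dimension of the state shapes once a batch shape is known. It is a simplified pure-Python model, not how MXNet's shape inference is actually implemented (the real thing propagates shapes through every operator). The helper resolve_state_shape and the example batch shape are assumptions made up for this example.

```python
def resolve_state_shape(info, batch_size):
    """Replace an unknown (0) batch dimension with the observed batch size.

    Hypothetical helper for illustration only; it is not part of MXNet.
    """
    shape = list(info['shape'])
    # '__layout__' names each axis; 'N' marks the batch axis.
    batch_axis = info['__layout__'].find('N')
    if batch_axis >= 0 and shape[batch_axis] == 0:
        shape[batch_axis] = batch_size
    return tuple(shape)


num_hidden = 128
# Same structure as the state_info shown earlier in this thread.
state_info = [{'shape': (0, num_hidden), '__layout__': 'NC'},
              {'shape': (0, num_hidden), '__layout__': 'NC'}]

# Assumed shape of the first batch: (batch, time steps, features).
first_batch_shape = (32, 10, 50)

resolved = [resolve_state_shape(info, first_batch_shape[0])
            for info in state_info]
print(resolved)  # [(32, 128), (32, 128)]
```

Until a batch size is available, the 0 stays in place as "unknown", which is consistent with the "input data shape must be given" check in the error above.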


It’s clear now, thanks for the helpful information :)