LSTM Split0 Operator Error


Reposting from a GitHub issue upon request:


I'm getting an error in the split0 operator when training an image captioning network in MXNet.

Package used (Python/R/Scala/Julia): I’m using Python

Error Message:


Creating Iterators...
Initiating Training...
INFO:root:Epoch[0] Train-perplexity=655.513238
INFO:root:Epoch[0] Time cost=1.261
infer_shape error. Arguments:
  image_feature: (50, 1024)
  word_data: (50, 77)
  softmax_label: (50,)
Traceback (most recent call last):
  File "", line 102, in <module>
    epoch_end_callback=mx.callback.do_checkpoint(checkpoints_prefix, period=10)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/", line 528, in fit
    batch_end_callback=eval_batch_end_callback, epoch=epoch)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/", line 244, in score
    self.forward(eval_batch, is_train=False)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/", line 608, in forward
    self.reshape(new_dshape, new_lshape)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/", line 470, in reshape
    self._exec_group.reshape(self._data_shapes, self._label_shapes)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/", line 381, in reshape
    self.bind_exec(data_shapes, label_shapes, reshape=True)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/module/", line 357, in bind_exec
    allow_up_sizing=True, **dict(data_shapes_i + label_shapes_i))
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/", line 402, in reshape
    arg_shapes, _, aux_shapes = self._symbol.infer_shape(**kwargs)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/symbol/", line 989, in infer_shape
    res = self._infer_shape_impl(False, *args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/symbol/", line 1119, in _infer_shape_impl
  File "/home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/", line 146, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator split0: [18:11:40] src/operator/./slice_channel-inl.h:208: Check failed: dshape[real_axis] % param_.num_outputs == 0U (31 vs. 0) You are trying to split the 1-th axis of input tensor with shape [50,78,256] into num_outputs=47 evenly sized chunks, but this is not possible because 47 does not evenly divide 78

Stack trace returned 10 entries:
[bt] (0) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/ [0x7f446310f938]
[bt] (1) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/ [0x7f446310fd48]
[bt] (2) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/ [0x7f4465705cb7]
[bt] (3) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/ [0x7f4465483b07]
[bt] (4) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/ [0x7f44652db74f]
[bt] (5) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/ [0x7f44652de268]
[bt] (6) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/mxnet/ [0x7f4465260659]
[bt] (7) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/../../ [0x7f4487cd1ec0]
[bt] (8) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/../../ [0x7f4487cd187d]
[bt] (9) /home/ubuntu/anaconda3/envs/mxnet_p36/lib/python3.6/lib-dynload/ [0x7f4487ee6dee]

Minimum reproducible example

I’m using the code from the following repository:

Everything is identical, except that I use my own dataset. I've preprocessed the data identically to what this implementation expects (it originally used the Flickr8k dataset).

What have you tried to solve it?

I need some help understanding where this error is coming from – in particular, why param_.num_outputs is set to 47.

  1. The error message is thrown here:

  2. num_outputs seems to be set here: … However, that code runs after self.forward is called, and the error seems to be thrown before num_outputs is set.
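For reference, the arithmetic behind the failed check can be reproduced in plain Python. This is only a sketch of the divisibility test quoted in the error message, not MXNet's own code. The helper name `split_remainder` is hypothetical, and the idea that one time step is the image feature (so 78 = 77 words + 1) is an assumption based on the shapes printed above.

```python
def split_remainder(shape, axis, num_outputs):
    """Remainder left when shape[axis] is cut into num_outputs equal chunks."""
    return shape[axis] % num_outputs


# Shape and num_outputs copied from the error message.
remainder = split_remainder((50, 78, 256), axis=1, num_outputs=47)
print(remainder)  # 31, matching "(31 vs. 0)" in the check failure

# Assumption: one time step along axis 1 is the image feature,
# and the remaining steps are word tokens.
failing_words = 78 - 1   # 77, matches word_data: (50, 77)
expected_words = 47 - 1  # 46, the word length the split was configured for
print(failing_words, expected_words)
```

If that assumption holds, the split was configured for captions of 46 tokens, while the batch that triggered the error carries 77 tokens.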


Hi @pn-train,

Given this error happens during the scoring of the model, I’d double check your evaluation data if I were you.
As a simple test, you could try calling fit with eval_data set to your training data and see if that runs (obviously ignoring the metrics returned!). If it does, confirm that you're applying the same pre-processing steps to your evaluation data. If it doesn't, could you provide a few more details on the processing of your data, ideally including some samples?
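One way to compare the pre-processing of the two datasets is to look at the caption lengths directly. The sketch below is plain Python on toy data and does not use the repository's loaders. The helper names, the padding index 0, and the sample captions are all hypothetical.

```python
def distinct_lengths(sequences):
    """Return the set of distinct lengths found in a list of token-id sequences."""
    return {len(seq) for seq in sequences}


def pad_or_truncate(sequences, target_len, pad_id=0):
    """Force every sequence to exactly target_len tokens.

    pad_id is an assumed padding index; truncation drops trailing tokens.
    """
    fixed = []
    for seq in sequences:
        clipped = list(seq[:target_len])
        fixed.append(clipped + [pad_id] * (target_len - len(clipped)))
    return fixed


# Hypothetical toy data standing in for the real caption arrays.
train_captions = [[1, 5, 7, 0], [1, 9, 2, 0]]
eval_captions = [[1, 5, 7, 3, 8, 0], [1, 4, 0]]

print(distinct_lengths(train_captions))  # {4}
print(distinct_lengths(eval_captions))   # lengths differ from training

# Bring the evaluation captions to the training length.
target_len = max(distinct_lengths(train_captions))
eval_fixed = pad_or_truncate(eval_captions, target_len)
```

If the evaluation captions turn out longer or shorter than the training captions, padding or truncating them to the training length before building the iterator should make the two shapes agree.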