(Edited to emphasize that the issue is independent of MMS.)
I’m writing a simple NLP classifier using MXNet, and I’d like to be able to send an arbitrary number of vectors for prediction.
I created the model with Reshape layers having 0 as the leading dimension, in order to accommodate an unknown batch size.
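For reference, here is a stripped-down sketch of the kind of symbol I mean (the vocab and layer sizes are placeholders, not my real model; the relevant part is the Reshape with a leading 0, which tells MXNet to copy the batch dimension from the input):

```python
import mxnet as mx

# Illustrative network only -- sizes are made up, not my actual model.
data = mx.sym.Variable('data')                                    # (batch, 20) token ids
embed = mx.sym.Embedding(data=data, input_dim=5000, output_dim=50)
flat = mx.sym.Reshape(data=embed, shape=(0, -1))                  # leading 0 = keep batch dim as-is
fc = mx.sym.FullyConnected(data=flat, num_hidden=2)
sym = mx.sym.SoftmaxOutput(data=fc, name='softmax')
```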
However, at inference time, I have to bind the data_shapes like so, which is where it errors out:
```python
# sym, arg_params, aux_params are loaded from a checkpoint beforehand
mod = mx.mod.Module(symbol=sym, context=mx.gpu(0))
mod.bind(for_training=False,
         data_shapes=[('data', (0, 20))],
         label_shapes=[('softmax_label', (0,))])
mod.set_params(arg_params, aux_params, allow_missing=True, allow_extra=True)
```

which fails with:

```
ValueError: Too many slices. Some splits are empty.
```
This works fine if I change the leading dimension to a non-zero value (and send in a correspondingly-sized tensor; see the sketch below). But I don't always know the size of the incoming inference request.
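Here is roughly what the working, fixed-size version looks like (batch size 10 is arbitrary, and this is a sketch rather than my exact code):

```python
batch_size = 10  # arbitrary fixed size; only works if every request matches it
mod = mx.mod.Module(symbol=sym, context=mx.gpu(0))
mod.bind(for_training=False,
         data_shapes=[('data', (batch_size, 20))],
         label_shapes=[('softmax_label', (batch_size,))])
mod.set_params(arg_params, aux_params, allow_missing=True, allow_extra=True)

# The incoming tensor then has to be exactly (batch_size, 20).
batch = mx.io.DataBatch(data=[mx.nd.zeros((batch_size, 20))])
mod.forward(batch, is_train=False)
probs = mod.get_outputs()[0].asnumpy()
```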
How can I serve a prediction request of arbitrary size?