I’m pretty sure I’ve figured it out now. I’m working with scipy.sparse.csr_matrix inputs that are converted to nd.sparse.csr_matrix for training and prediction. I was explicitly setting the dtype to np.float32 when setting up the training data, but not for the prediction data. It’s not obvious to me why prediction itself would work while asscalar() fails; perhaps it’s the asynchronous execution that @Sergey mentioned above.
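If I’m reading the docs right, csr_matrix conversions inherit the source array’s dtype when dtype isn’t given, so my integer NumPy arrays came through as int64 rather than the float32 the module was bound with. The same inheritance is easy to see on the scipy side (a small sketch; the array here is just illustrative):

```python
import numpy as np
import scipy.sparse as sp

# Integer source data, like the x array in the repro below
x = np.array([[0, 0, 1],
              [1, 0, 0]])

implicit = sp.csr_matrix(x)                    # inherits the integer dtype
explicit = sp.csr_matrix(x, dtype=np.float32)  # forced to float32

print(implicit.dtype, explicit.dtype)
```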

Here’s a sample that reproduces the issue (though here the nd.sparse.csr_matrix is built from a dense np.array):

```
import numpy as np
np.random.seed(12345)
import mxnet as mx
mx.random.seed(12345)
from mxnet import ndarray as nd
# Set up some data
x = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 1, 1]
])
y = np.array([
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1],
    [1, 0],
    [0, 1]
])
# Training data has the dtype set explicitly
train_x = nd.sparse.csr_matrix(x, dtype=np.float32, ctx=mx.cpu())
train_y = nd.sparse.csr_matrix(y, dtype=np.float32, ctx=mx.cpu())
# Set up the model
norm_init = mx.initializer.Normal(sigma=0.1)
bias_init = mx.initializer.Zero()
X = mx.sym.Variable('X', stype='csr')
Y = mx.sym.Variable('Y', stype='csr')
W1 = mx.symbol.Variable('W1', stype='row_sparse', shape=(3, 5), init=norm_init)
b1 = mx.symbol.Variable('b1', shape=5, init=bias_init)
f1 = mx.sym.broadcast_add(mx.sym.sparse.dot(X, W1), b1)
r1 = mx.sym.relu(f1)
d1 = mx.sym.Dropout(r1, p=0.5)
W2 = mx.symbol.Variable('W2', shape=(5, 2), init=norm_init)
b2 = mx.symbol.Variable('b2', shape=2, init=bias_init)
f2 = mx.sym.broadcast_add(mx.sym.sparse.dot(d1, W2), b2)
output = mx.sym.LogisticRegressionOutput(f2, label=Y)
# Create the module and train
train_iter = mx.io.NDArrayIter(train_x, train_y, batch_size=2, last_batch_handle='discard', data_name='X', label_name='Y')
mod = mx.mod.Module(symbol=output,
                    data_names=['X'],
                    label_names=['Y'],
                    context=mx.gpu())
mod.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)
mod.init_params()
mod.init_optimizer(optimizer='sgd', optimizer_params={'learning_rate':0.5})
train_iter.reset()
mod.fit(train_iter,
        eval_metric='loss',
        num_epoch=1)
# Set up the prediction data, without specifying the dtype. If specified, the rest works.
test_x = nd.sparse.csr_matrix(x, ctx=mx.cpu())
test_iter = mx.io.NDArrayIter(test_x, None, batch_size=2, last_batch_handle='discard', data_name='X', label_name='Y')
for (outputs, nbatch, batch) in mod.iter_predict(test_iter):
    predicted = outputs[0]
    print(predicted.context)
    topk = predicted.topk(k=1, ret_typ='both')
    for idx in range(predicted.shape[0]):
        top_predictions = [(topk[1][idx, j].asscalar(), topk[0][idx, j].asscalar()) for j in range(1)]
        print(top_predictions)
```
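An alternative to passing dtype at every conversion site is to cast the NumPy array once, up front, since the conversion inherits the source dtype. A minimal sketch (the nd.sparse.csr_matrix lines are shown as comments because they need an MXNet runtime):

```python
import numpy as np

x = np.array([[0, 0, 1],
              [1, 1, 0]])     # defaults to an integer dtype

x32 = x.astype(np.float32)    # cast once, before any conversion

# Both conversions now inherit float32, so training and prediction
# see the same dtype without repeating dtype=np.float32:
# train_x = nd.sparse.csr_matrix(x32, ctx=mx.cpu())
# test_x  = nd.sparse.csr_matrix(x32, ctx=mx.cpu())

print(x32.dtype)
```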