Inconsistent predictions with Dropout and repeating predictions

I’ve trained a simple net for matrix factorization with dropout on the input layer. Essentially, row ids and column ids are fed through embedding layers, concatenated, and passed through dropout. I set the mode to “training” on the dropout layer, which I understood to mean that dropout is applied only during training. However, when I get predictions (binding with for_training=False), the results are inconsistent, which suggests that dropout is still being applied.

Moreover, while the network is simple, the data is huge - over 1.5 billion rows. For my evaluation I’m trying to store predictions alongside the input features (row and column ids). Perhaps because of batch-size padding, a number of rows appear multiple times throughout my predictions - and since dropout is applied, each copy of the same row gets a different prediction - hardly a desirable outcome for a production system. (I suspect there would be more tutorials on deploying mxnet models to production systems if the documentation were clearer and more coherent.)
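
The padding effect described above can be illustrated without MXNet. What follows is a minimal NumPy sketch, assuming the iterator fills a short final batch by wrapping around to the first rows and reports how many filler rows it added (MXNet exposes such a count as `pad` on a `DataBatch`). The helper names `padded_batches` and `trim_padding` are hypothetical and only serve the illustration.

```python
import numpy as np


def padded_batches(rows, batch_size):
    """Yield (batch, pad) pairs; a short final batch is filled by wrapping to the start."""
    n = len(rows)
    for start in range(0, n, batch_size):
        batch = rows[start:start + batch_size]
        pad = batch_size - len(batch)
        if pad > 0:
            # Assumed behaviour: reuse the first `pad` rows as filler.
            batch = np.concatenate([batch, rows[:pad]])
        yield batch, pad


def trim_padding(batch, pad):
    """Drop the filler rows at the end of a padded batch."""
    return batch[:len(batch) - pad] if pad > 0 else batch


rows = np.arange(10).reshape(5, 2)  # five (row id, column id) pairs
raw = [b for b, _ in padded_batches(rows, batch_size=2)]
trimmed = [trim_padding(b, p) for b, p in padded_batches(rows, batch_size=2)]

print(sum(len(b) for b in raw))      # 6: one row is stored twice
print(sum(len(b) for b in trimmed))  # 5: every row is stored exactly once
```

In a loop over an MXNet iterator, the analogous step would be to keep only the first `batch_size - batch.pad` rows of both the inputs and the predictions before writing them out.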

Can someone tell me what I’m doing wrong with predictions?

import numpy as np
import mxnet as mx
import pandas
import os

matrixPath = "somePath"
modelPrefix = "someModelPrefix"
filestem = "someOutputPath"
epoch = 10
batch_size = 1000000

if not os.path.exists(filestem):
    os.makedirs(filestem)

test_iter = mx.io.CSVIter(data_csv=matrixPath, data_shape=(2,), batch_size=batch_size)
batchidx = 0
for batch in test_iter:
    batchidx += 1
    fileName = filestem + "predictions_batch_" + str(batchidx) + ".csv"
    f = open(fileName, 'ab')
    data = batch.data[0]
    label = batch.label[0]
    dataiter = mx.io.NDArrayIter(data, label, data.shape[0], False, last_batch_handle='pad')
    model = mx.module.Module.load(prefix=modelPrefix, epoch=epoch)
    model.bind(data_shapes=dataiter.provide_data, label_shapes=dataiter.provide_label, for_training=False)
    preds = model.predict(dataiter)
    result = mx.nd.concat(data, preds)
    np.savetxt(fileName, result.asnumpy(), delimiter=',')
    f.close()

Can you try loading your model in Gluon to see whether you get the same problem? For example, with the latest master (install using pip install mxnet --pre):

from mxnet import gluon

# Load the symbolic model
net = gluon.nn.SymbolBlock.imports('someModelPrefix-symbol.json', ['data'], 'someModelPrefix-0010.params')

Can you confirm that running the same datapoint twice gives two different results?
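
One way to make that check concrete is to feed the same input through the forward pass twice and compare the outputs. Below is a minimal sketch in plain NumPy rather than MXNet: `dropout` is a hand-written inverted dropout standing in for the real layer, and `is_deterministic` is a hypothetical helper. Neither comes from the MXNet API.

```python
import numpy as np


def dropout(x, p, training, rng):
    """Inverted dropout: random mask in training mode, identity otherwise."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p  # keep each element with probability 1 - p
    return x * mask / (1.0 - p)


def is_deterministic(forward, x):
    """Run the same input twice and report whether the outputs match."""
    return bool(np.allclose(forward(x), forward(x)))


rng = np.random.default_rng(0)
x = np.ones((4, 8))


def train_forward(a):
    # Dropout active: a fresh random mask is drawn on every call.
    return dropout(a, p=0.5, training=True, rng=rng)


def infer_forward(a):
    # Dropout inactive: the input passes through unchanged.
    return dropout(a, p=0.5, training=False, rng=rng)


print(is_deterministic(infer_forward, x))  # True
print(is_deterministic(train_forward, x))  # expected False: the two masks differ
```

If the loaded model behaves like `train_forward` here, dropout is still active at prediction time. If it behaves like `infer_forward`, the differing predictions more likely come from somewhere else in the pipeline.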