I’ve trained a simple net for matrix factorization with dropout on the input layer. Essentially, row ids and column ids are fed through embedding layers, concatenated, and hit with dropout. I set mode to “training” on the dropout layer, which I understood to mean that dropout would only be applied during training. However, when I try to get predictions (binding with for_training=False), I get inconsistent results that seem to indicate dropout is still being applied.
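To make the expectation concrete, here is a minimal NumPy sketch of how dropout is normally expected to behave in the two modes. It is not MXNet's implementation; the `dropout` helper, its `training` flag, and the rate `p` are hypothetical names used only for illustration.

```python
import numpy as np


def dropout(x, p, training, rng):
    """Inverted dropout that is active only when `training` is True."""
    if not training or p == 0.0:
        # Inference: identity, so repeated calls on the same input agree.
        return x
    # Training: zero each element with probability p, rescale the survivors.
    keep = rng.random(x.shape) >= p
    return x * keep / (1.0 - p)


rng = np.random.default_rng(0)
x = np.ones((4, 3))

eval_a = dropout(x, 0.5, training=False, rng=rng)
eval_b = dropout(x, 0.5, training=False, rng=rng)
train_a = dropout(x, 0.5, training=True, rng=rng)
```

With the flag off, `eval_a` and `eval_b` are identical to the input, which is the behaviour I expected from prediction. With the flag on, each element of `train_a` is either zeroed or scaled by 1 / (1 - p), so two passes over the same row can disagree.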
Moreover, while the network is simple, the data is huge: over 1.5 billion rows. For my evaluation I’m trying to store predictions alongside the input features (row and column ids). Perhaps because of batch padding, a number of rows appear multiple times in my predictions, and since dropout is applied, each copy of the same row gets a different prediction. That is hardly a desirable outcome for a production system. (I suspect there would be more tutorials on deploying mxnet models to production if the documentation were clearer and more coherent.)
Can someone tell me what I’m doing wrong with predictions?
import os

import numpy as np
import mxnet as mx

matrixPath = "somePath"
modelPrefix = "someModelPrefix"
filestem = "someOutputPath"
epoch = 10
batch_size = 1000000
if not os.path.exists(filestem):
    os.makedirs(filestem)  # assumed: create the output directory if it is missing
test_iter = mx.io.CSVIter(data_csv=matrixPath, data_shape=(2,), batch_size=batch_size)
batchidx = 0
for batch in test_iter:
    batchidx += 1
    fileName = filestem + "predictions_batch_" + str(batchidx) + ".csv"
    data = batch.data
    label = batch.label
    dataiter = mx.io.NDArrayIter(data, label, data.shape, False, last_batch_handle='pad')
    model = mx.module.Module.load(prefix=modelPrefix, epoch=epoch)
    preds = model.predict(dataiter)
    result = mx.nd.concat(data, preds)
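If the repeated rows do come from padding, one way to keep them out of the stored output would be to drop the padded tail of each batch before concatenating features and predictions. The NumPy sketch below only illustrates that idea. `trim_padding` and the example arrays are hypothetical, and it assumes the number of padded rows is known for each batch (in MXNet this would presumably be the batch's `pad` attribute, which I have not verified for `CSVIter`).

```python
import numpy as np


def trim_padding(features, preds, pad):
    """Drop the last `pad` rows of a batch, then join features and predictions."""
    n_valid = features.shape[0] - pad
    return np.concatenate([features[:n_valid], preds[:n_valid]], axis=1)


# Hypothetical batch of 5 rows in which the last 2 are padding
# (copies of earlier rows used to fill the batch).
features = np.array([[0, 7], [1, 3], [2, 9], [0, 7], [1, 3]], dtype=np.float32)
preds = np.array([[0.1], [0.2], [0.3], [0.1], [0.2]], dtype=np.float32)

result = trim_padding(features, preds, pad=2)
```

Here `result` keeps only the three real rows, each with its row id, column id, and prediction. This would remove the duplicates from the output, but it would not explain why copies of the same row receive different predictions in the first place.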