Inconsistent predictions with Dropout and repeating predictions

dscantle · July 5, 2018, 3:58pm

I've trained a simple net for matrix factorization with dropout on the input layer. Row ids and column ids are fed through embedding layers, concatenated, and then passed through dropout. I set the dropout layer's mode to "training", which I understood to mean that dropout is applied only during training. However, when I get predictions (binding with for_training=False), the results are inconsistent, which suggests dropout is still being applied.
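For reference, MXNet's Dropout operator documents a mode argument with the values 'training' and 'always'. As I understand it, 'training' should make the layer a pass-through at inference time, while 'always' keeps dropping units on every forward pass. The sketch below is plain Python and not MXNet's implementation; the function name and its arguments are made up for illustration. It only shows the behaviour one would expect from each mode.

```python
import random


def dropout(values, p, mode="training", is_train=False, rng=random):
    """Illustrative inverted dropout over a list of floats (hypothetical helper).

    mode="training": drop units only when is_train is True.
    mode="always":   drop units on every call, including inference.
    """
    if p <= 0.0 or (mode == "training" and not is_train):
        # Pass-through: inference output is deterministic.
        return list(values)
    keep = 1.0 - p
    # Zero each unit with probability p and rescale the survivors by 1/keep.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


x = [1.0, 2.0, 3.0, 4.0]

# With mode="training" and is_train=False, repeated calls agree with the input.
inference_a = dropout(x, p=0.5, mode="training", is_train=False)
inference_b = dropout(x, p=0.5, mode="training", is_train=False)

# With mode="always", units are still dropped at inference time.
always_out = dropout(x, p=0.5, mode="always", is_train=False,
                     rng=random.Random(0))
```

If predictions for the same input vary between calls, the network is behaving like the mode="always" case, or the executor is somehow being run in training mode.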

Moreover, while the network is simple, the data is huge: over 1.5 billion rows. For evaluation I'm trying to store predictions alongside the input features (row and column ids). Perhaps because of batch-size padding, a number of rows appear multiple times throughout my predictions, and since dropout is applied, each copy gets a different prediction, which is hardly a desirable outcome for a production system. (I suspect there would be more tutorials on deploying mxnet models to production systems if the documentation were clearer and more coherent.)
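To make the padding suspicion concrete: when the number of rows is not a multiple of the batch size, an iterator that always emits full batches has to fill the last one with extra rows. MXNet batches carry a pad attribute for this, and CSVIter has a round_batch option that, as far as I can tell, wraps around to the start of the file. The sketch below is plain Python, not MXNet code; padded_batches is a hypothetical stand-in for such an iterator. It shows how duplicates arise and how trimming by the pad count removes them.

```python
def padded_batches(rows, batch_size):
    """Yield (batch, pad) pairs; a short final batch wraps around to fill up.

    Hypothetical stand-in for an iterator that always emits full batches.
    """
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        pad = batch_size - len(batch)
        # Fill the short final batch with rows taken from the start of the data.
        batch = batch + [rows[i % len(rows)] for i in range(pad)]
        yield batch, pad


# Five (row id, column id) pairs with a batch size of three.
rows = [(0, 10), (1, 11), (2, 12), (3, 13), (4, 14)]

untrimmed, trimmed = [], []
for batch, pad in padded_batches(rows, batch_size=3):
    # Storing the whole batch keeps the padded duplicates.
    untrimmed.extend(batch)
    # Dropping the last `pad` entries keeps only the real rows.
    trimmed.extend(batch[:len(batch) - pad])
```

Here untrimmed ends up with six entries, the last being a repeat of the first row, while trimmed matches the input exactly.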

Can someone tell me what I'm doing wrong with predictions?

import numpy as np
import mxnet as mx
import pandas
import os

matrixPath = "somePath"
modelPrefix = "someModelPrefix"
filestem = "someOutputPath"
epoch = 10
batch_size = 1000000

if not os.path.exists(filestem):
    os.makedirs(filestem)

test_iter = mx.io.CSVIter(data_csv=matrixPath, data_shape=(2,), batch_size=batch_size)
batchidx = 0
for batch in test_iter:
    batchidx += 1
    fileName = filestem + "predictions_batch_" + str(batchidx) + ".csv"
    f = open(fileName, 'ab')
    data = batch.data[0]
    label = batch.label[0]
    dataiter = mx.io.NDArrayIter(data, label, data.shape[0], False, last_batch_handle='pad')
    model = mx.module.Module.load(prefix=modelPrefix, epoch=epoch)
    model.bind(data_shapes=dataiter.provide_data, label_shapes=dataiter.provide_label, for_training=False)
    preds = model.predict(dataiter)
    result = mx.nd.concat(data, preds)
    np.savetxt(fileName, result.asnumpy(), delimiter=',')
    f.close()

ThomasDelteil · July 6, 2018, 8:34pm

Can you try loading your model in Gluon to see whether you get the same problem? For example, with the latest master (install using pip install mxnet --pre):

from mxnet import gluon

# Load the symbolic model
net = gluon.nn.SymbolBlock.imports('someModelPrefix-symbol.json', ['data'], 'someModelPrefix-0010.params')

Can you confirm that running the same datapoint twice gives two different results?
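One way to check this is to call the model on the same input several times and compare the outputs. The helper below is a minimal sketch, not part of MXNet: is_deterministic and both stand-in models are hypothetical, and predict_fn is assumed to be any callable that returns a flat list of floats for one datapoint.

```python
import random


def is_deterministic(predict_fn, datapoint, runs=2, tol=1e-6):
    """Return True if repeated predictions for one datapoint agree within tol.

    predict_fn is an assumed callable: datapoint -> flat list of floats.
    """
    first = predict_fn(datapoint)
    for _ in range(runs - 1):
        again = predict_fn(datapoint)
        if len(again) != len(first):
            return False
        if any(abs(a - b) > tol for a, b in zip(first, again)):
            return False
    return True


# Stand-in for a model with dropout disabled at inference time.
def stable_model(point):
    return [0.5 * sum(point)]


# Stand-in for a model that still applies random noise at inference time.
_rng = random.Random(0)


def noisy_model(point):
    return [_rng.random() * sum(point)]


stable_ok = is_deterministic(stable_model, [3.0, 7.0], runs=3)
noisy_ok = is_deterministic(noisy_model, [3.0, 7.0], runs=3)
```

With the Gluon block loaded above, predict_fn would wrap a forward call on a one-row batch and convert the result to a Python list. If the check fails there as well, the problem is in the saved symbol rather than in the Module prediction loop.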
