Reproduce results with different MXNET versions?

danithaca · August 21, 2018, 2:46pm

Hi,

I’m trying to reproduce results from a model trained with mxnet 0.10. I’m using the exact same hyperparameters and training/validation data, but retraining with mxnet 1.2.1. Now I get different results and can’t reproduce the exact same model.

Questions:

1. Should I expect models to differ when trained with different mxnet versions?
2. If yes, should I expect models trained with a newer mxnet version to have the same or better quality in general (with the same hyperparams)? Otherwise, I might need to treat “mxnet version” as a hyperparam to tune.

Thanks.
-Daniel

ThomasDelteil · August 21, 2018, 5:53pm

With each new version we fix a certain number of bugs, so you can expect quality to improve overall. With new features, though, it is possible that some regressions slipped in as well, despite the unit and integration tests.

However, with deep learning, bugs are sometimes features: the network learns around them. In practice, this means you might need to retrain your network with different hyper-parameters to reach the same accuracy after a given bug is fixed.

The other thing to take into account is the initialization of your network. Deep neural networks are extremely sensitive to initialization, and two training runs with different initializations can end up with completely different final accuracies. Some families of networks, like GANs, are especially sensitive to this.

Also, what do you mean by “I can’t reproduce the exact same model”? If you mean accuracy, see the comments above. If you mean the exact same final weights, then because of the stochastic nature of training, it is very unlikely that you would end up with two identical copies of the network after two runs.

danithaca · August 21, 2018, 6:00pm

Thanks for the feedback.

We set the same random seed for python, numpy and mxnet, so theoretically each training run should be deterministic? I did try retraining the model a few times with the exact same settings (hyperparams, data, random seed, # of epochs, etc.), and each time I can reproduce the same results if I use the exact same version of the software. That’s why we thought the reason we couldn’t reproduce the exact same model (network weights) was the mxnet version difference, because all the other settings were exactly the same.
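
For anyone following along, the seeding described above could look roughly like the sketch below. The helper names `seed_everything` and `draw_sample` are made up for illustration and are not from our code base. The mxnet import is wrapped in a try/except so the snippet still runs where mxnet is not installed. Seeding like this only fixes the random number streams; it does not by itself guarantee identical results across different library versions.

```python
import random

import numpy as np


def seed_everything(seed):
    """Seed the Python, NumPy and (if installed) MXNet random generators.

    Returns True if MXNet was found and seeded, False otherwise.
    """
    random.seed(seed)
    np.random.seed(seed)
    try:
        import mxnet as mx  # optional: only seeded when available
    except ImportError:
        return False
    mx.random.seed(seed)
    return True


def draw_sample():
    """Draw a few numbers from the Python and NumPy generators."""
    return random.random(), np.random.rand(3).tolist()


# Same seed, same draws: the two samples should match within one environment.
seed_everything(42)
first = draw_sample()
seed_everything(42)
second = draw_sample()
print(first == second)
```

Call `seed_everything` once at the very start of the training script, before the network is initialized and before any data shuffling happens.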

ThomasDelteil · August 21, 2018, 6:19pm

Yes, operators and optimizers are updated to fix existing bugs, numerical instability, etc. You can expect the final values of your weights to change across versions, especially major ones.
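
Since exact equality of weights is not a realistic target across versions, one option is to compare the two models within a tolerance and look at where they diverge most. Below is a minimal sketch, assuming you have already exported each model's parameters to a dict of NumPy arrays (for example by calling `asnumpy()` on each NDArray). The function name `compare_weights`, the parameter names and the tolerance values are illustrative assumptions, not recommendations.

```python
import numpy as np


def compare_weights(params_a, params_b, rtol=1e-3, atol=1e-5):
    """Compare two {name: array} dicts parameter by parameter.

    Returns {name: (is_close, max_abs_diff)} for names present in both dicts.
    """
    report = {}
    for name in sorted(set(params_a) & set(params_b)):
        a = np.asarray(params_a[name], dtype=np.float64)
        b = np.asarray(params_b[name], dtype=np.float64)
        if a.shape != b.shape:
            # Shapes differ, so the parameters cannot be compared elementwise.
            report[name] = (False, float("inf"))
            continue
        max_diff = float(np.max(np.abs(a - b))) if a.size else 0.0
        is_close = bool(np.allclose(a, b, rtol=rtol, atol=atol))
        report[name] = (is_close, max_diff)
    return report


# Toy weights standing in for models trained under two library versions.
old_params = {
    "fc1_weight": np.array([[0.5, -0.25], [1.0, 2.0]]),
    "fc1_bias": np.array([0.0, 0.1]),
}
new_params = {
    "fc1_weight": old_params["fc1_weight"] + 1e-6,  # tiny numerical drift
    "fc1_bias": np.array([0.0, 0.5]),               # clearly different
}

for name, (is_close, max_diff) in compare_weights(old_params, new_params).items():
    print(name, is_close, max_diff)
```

In practice, comparing validation metrics between the two versions is usually more informative than comparing raw weights, because two networks can differ noticeably in their weights and still reach similar accuracy.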
