Ensuring the same result with dropout using random seed?


#1

During hyper-parameter tuning, it is often useful to ensure that your model produces the same result with the same set of hyper-parameters. I usually set the random seeds for this:

import random
import mxnet as mx
mx.random.seed(1)
random.seed(1)

This works fine in most cases, but when I apply dropout, it starts producing different results:

from mxnet import gluon

class LRModel(gluon.Block):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dropout = gluon.nn.Dropout(0.2)  # drop 20% of activations during training
        self.out = gluon.nn.Dense(10)

    def forward(self, x):
        x = self.dropout(x)
        x = self.out(x)
        return x

Is there a way of applying seeded randomness to dropout? I have a similar issue with 2D convolution as well. Thanks!
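
For reference, here is a minimal sketch of how I observe this, assuming the LRModel class above and a made-up (8, 20) input batch:

import random
import mxnet as mx
from mxnet import autograd, nd

def run_once():
    # Re-seed everything before each run, so all seeded randomness should repeat.
    mx.random.seed(1)
    random.seed(1)
    net = LRModel()
    net.initialize()
    x = nd.ones((8, 20))                     # made-up input batch
    with autograd.record(train_mode=True):   # dropout is only active in training mode
        y = net(x)
    return y

print(run_once())
print(run_once())   # I would expect identical outputs, but they differ once Dropout is involved

The weights are identical across the two runs (initialization happens right after re-seeding), so the difference appears to come only from the dropout mask.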


#2

I’m not an expert, but just a random idea: have you tried resetting your data iterator?


#3

@robert Resetting the data iterator doesn’t work for this case; thanks for the suggestion though.


#4

https://mxnet.incubator.apache.org/api/python/ndarray.html#mxnet.random.seed

The doc says mx.random.seed will affect dropout randomness. Is that right?


#5

According to the description, setting the random seed should give the same result on the same machine even with dropout, but that’s not what I see. When I train a simple logistic regression, it gives the same result, but as soon as I use dropout, even after setting the random seed, it produces different results every time I train. I also see this with Conv2D, so I’m wondering whether the random seed is not working properly with those operations.


#6

I think this needs to be reported as an issue.


#7

I just talked to @mli. The seed is only used during weight initialization and not by dropout. Another limitation is that the seed is not honored when you are training on GPUs, because cuDNN still returns random results.

@mli can add more about the reasoning.
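
A rough sketch of what that distinction would look like in practice (layer sizes and shapes are made up, and the dropout behavior shown is what affected versions would produce, not a guarantee):

import mxnet as mx
from mxnet import gluon, nd

def init_weights():
    # Re-seeding makes weight initialization repeat exactly.
    mx.random.seed(1)
    layer = gluon.nn.Dense(4, in_units=3)
    layer.initialize()
    return layer.weight.data()

print(init_weights())
print(init_weights())   # identical: weight initialization honors the seed

def dropped_out():
    # On affected versions, the dropout mask does not repeat despite re-seeding.
    mx.random.seed(1)
    x = nd.ones((2, 6))
    with mx.autograd.train_mode():   # dropout is only applied in training mode
        return nd.Dropout(x, p=0.5)

print(dropped_out())
print(dropped_out())   # can differ even though the same seed was set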


#8

@jdchoi77 Can you expand on your use case a bit more? Are you observing this while running on GPUs?


#9

@jdchoi77 The behavior of the random generators has been fixed as part of https://github.com/apache/incubator-mxnet/pull/9119, which makes the operators honor the seed as well.

I tested with the latest MXNet (from the master branch) and the seed is working on CPU. The MKL version of MXNet is also being worked on to get a fix.

I haven’t had the opportunity to test on a GPU instance. If you already have a setup, please try with the latest version of MXNet. I will update here as soon as I test on a GPU.


#10

I verified deterministic dropout output for CPU and GPU with MKL enabled. Is that what needed checking?

Script:

import mxnet as mx
import mxnet.ndarray as nd
import mxnet.test_utils

ctx = mx.gpu(0)
mxnet.test_utils.set_default_context(ctx)

print(mx.__version__)

mx.random.seed(2)

A = nd.arange(20).reshape((5, 4))
print(A)

a = mx.symbol.Variable('a')
dropout = mx.symbol.Dropout(a, p=0.5)
executor = dropout.simple_bind(ctx, a=A.shape)

executor.forward(is_train=True, a=A)
print(executor.outputs[0])

executor.forward(is_train=True, a=A)
print(executor.outputs[0])

executor.forward(is_train=True, a=A)
print(executor.outputs[0])
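
As a small follow-up, a hedged way to turn the visual check into an assertion (reusing executor and A from the script above, and assuming nothing else draws from the global RNG between seeding and the forward pass):

import numpy as np

# Re-seed, run the forward pass twice, and compare the dropout outputs.
mx.random.seed(2)
executor.forward(is_train=True, a=A)
first = executor.outputs[0].asnumpy()    # copy, since the output buffer is reused

mx.random.seed(2)
executor.forward(is_train=True, a=A)
second = executor.outputs[0].asnumpy()

# If the operator honors the seed, the two outputs should match exactly.
np.testing.assert_array_equal(first, second)
print('dropout output repeats after re-seeding')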


#11

PR: https://github.com/apache/incubator-mxnet/pull/9366


#12

Thank you @nswamy and @cjolivier01 for the changes and comments. At the moment, I’m dealing with some other issues on AWS, but I will come back to this thread later this week or early next week with feedback. Happy new year to everyone.