This does what I want if I call hybridize() before training. However, I am unable to modify this to get a HybridBlock that can run in both symbolic and imperative mode. (I also have a version that can run only in imperative mode by using the ndarray API, but that's also not what I want.) If I just run the code above without hybridize, the result is: mxnet.base.MXNetError: [21:02:32] ../src/imperative/./imperative_utils.h:122: Check failed: infershape[attrs.op](attrs, &in_shapes, &out_shapes)
Now if I edit the call to add the shape parameter to the random uniform call, F.random_uniform(shape=x.shape),
and try to run it on the GPU, I get an error from the add operator: mxnet.base.MXNetError: [21:11:58] ../src/imperative/./imperative_utils.h:70: Check failed: inputs[i]->ctx().dev_mask() == ctx.dev_mask() (1 vs. 2) Operator broadcast_add require all inputs live on the same context. But the first argument is on gpu(0) while the 2-th argument is on cpu(0)
This really astonishes me. Can I not just call __add__ with a hard-coded scalar? How else would I do that? I can't hard-code the creation of an nd.array or of a symbol there, as I would lose the hybrid property again.
But I guess the first error (shape inference) is more relevant here, as I cannot do x.shape with symbols anyway.
I actually know the exact shape of the noise that I need, but passing a tuple for the shape argument results in Deferred initialization failed because shape cannot be inferred. if I hybridize, and again the different-context error in the plus operator if I do not hybridize.
I am puzzled that MXNet can infer the shape with symbols, but not if I use ndarrays. Can I somehow specify shapes (similar to a custom operator property class) for hybrid blocks? Or is there something else wrong with the above code in imperative mode?
Please let me know if I did not provide necessary information, such as full stack traces. Any help is much appreciated.
In the line where you call noise = F.random_uniform(), noise is always a single random value, which is definitely not what you want for proper random sampling.
In the symbolic case it appears that a shape is inferred, but what actually happens is that when you call x + noise, noise is broadcast to the shape of x.
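As a NumPy analogy of that broadcasting behavior (illustration only, not MXNet code): adding a single scalar to an array repeats the same value across every element, so no shape ever needs to be inferred for the noise itself.

```python
import numpy as np

x = np.ones((2, 3))
noise = np.random.uniform()   # one scalar, like F.random_uniform() with no shape
y = x + noise                 # the scalar is broadcast to x's shape
assert y.shape == (2, 3)
assert np.allclose(y - x, noise)   # every element received the same noise value
```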
The following simple changes will fix your problem:
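A sketch of the idea behind the fix, in NumPy terms since the exact MXNet call depends on the version: draw noise with the same shape as x (which is what a call like F.random.uniform_like(x) would provide), so every element gets an independent sample instead of one shared scalar.

```python
import numpy as np

x = np.ones((4, 3))
# per-element noise with the same shape as x (the role of uniform_like)
noise = np.random.uniform(size=x.shape)
y = x + noise
assert y.shape == x.shape
assert noise.std() > 0   # independent samples, not one repeated scalar
```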
Thanks for your reply. It makes sense to me, and I was already looking for uniform_like before. But I cannot find it, and your code throws the expected error: AttributeError: module 'mxnet.symbol.random' has no attribute 'uniform_like'
It is really weird that this function is mentioned in the documentation: Random Distribution Generator Symbol API — mxnet documentation
but as far as I can see, it is not implemented! https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/symbol/random.py I am on mxnet 1.3.0, but looking at the code on GitHub, I don't think upgrading would solve my issue.
I know the shapes in my case, so I made a compromise to solve my problem. I now have the same noise for every image in the batch, which I think is fine for my case. I still couldn't find a way to generate the noise for varying batch sizes.
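In NumPy terms (hypothetical shapes, for illustration only), this shows why hard-coding the full noise shape cannot serve varying batch sizes: the baked-in batch axis no longer matches when a batch of a different size arrives.

```python
import numpy as np

noise = np.random.uniform(size=(8, 1, 5, 5))  # full shape with batch axis of 8
x = np.ones((4, 1, 5, 5))                     # a batch of a different size
try:
    _ = x + noise                 # 4 vs 8 on the batch axis cannot broadcast
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
assert broadcast_failed
```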
I work on a bigger project where MXNet is built from source. Updating it might be possible, but I think it's not needed for this use case.
My batch size varies, and I need 2 random samples for each image (the last 2 axes are always fixed in this part of the network). So with my current implementation, as far as I understand it, I have the same noise (the same 2 random variables) for every image in my batch, instead of drawing independently for each image. It's not optimal, but it should be OK. If I realize that it hinders the training process, I will update to 1.3.1 and use uniform_like to get independent random variables for each image in the batch.
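The compromise above can be sketched in NumPy (assumed shapes: a batch axis of any size plus 2 noise values per image): by omitting the batch axis from the noise shape, broadcasting repeats the identical noise for every image, regardless of batch size.

```python
import numpy as np

batch = np.zeros((8, 2))              # batch size varies at run time
noise = np.random.uniform(size=(2,))  # fixed shape, no batch axis
out = batch + noise                   # broadcast: same 2 values for every image
assert out.shape == (8, 2)
assert np.allclose(out[0], out[5])    # identical noise across the whole batch
```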