Hey all, as the title indicates, I think there’s a bug (or at least surprising behavior) in how ParameterDict handles the parameters of the shared ParameterDict that it can be initialized with.
The Gluon Block’s collect_params method does not surface the parameters of “shared” ParameterDicts, and we need this functionality for some other work.
I think this is a bug in the implementation of the ParameterDict class’s __getitem__ functionality, but I definitely could be missing something. The get method looks in the shared dict, but the __getitem__ method doesn’t, and __getitem__ is what the Gluon Block’s collect_params method uses. I don’t need it to add items to the shared dict, just for them to show up when I loop through the overall ParameterDict.
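To make the asymmetry concrete, here is a toy sketch in plain Python. It is not the MXNet source: the class ToyParamDict and its attributes are hypothetical stand-ins that only mimic the behavior described above, where get falls back to the shared dictionary while __getitem__ and iteration see local entries only.

```python
from collections import OrderedDict


class ToyParamDict:
    """Hypothetical stand-in for a parameter dict with a shared parent."""

    def __init__(self, shared=None):
        self._params = OrderedDict()
        self._shared = shared

    def get(self, name, value=None):
        # Look locally first, then fall back to the shared dictionary.
        if name in self._params:
            return self._params[name]
        if self._shared is not None and name in self._shared._params:
            return self._shared._params[name]
        # Not found anywhere: create the entry locally.
        self._params[name] = value
        return value

    def __getitem__(self, name):
        # Only consults local entries; the shared dictionary is ignored.
        return self._params[name]

    def items(self):
        # Iteration likewise only sees local entries.
        return self._params.items()


outer = ToyParamDict()
outer.get("a", value="param a")

inner = ToyParamDict(shared=outer)
found_by_get = inner.get("a")               # resolved through the shared dict
visible_names = [name for name, _ in inner.items()]  # nothing stored locally

try:
    inner["a"]
    getitem_ok = True
except KeyError:
    getitem_ok = False
```

In this toy model, anything that walks the inner dictionary through items or __getitem__ never sees "a", even though get can find it.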
Minimum reproducible example
    from mxnet.gluon import HybridBlock, ParameterDict

    # Build a ParameterDict that already holds one parameter, "a".
    outer_params = ParameterDict()
    outer_params.get("a", shape=(2, 2))
    outer_params.initialize()

    class ExampleBlock(HybridBlock):
        def __init__(self, param_dict, prefix=''):
            # Pass the existing ParameterDict in as the Block's shared params.
            super(ExampleBlock, self).__init__(prefix=prefix, params=param_dict)

    eb = ExampleBlock(outer_params)
    eb.collect_params()
Running this, I would expect it to output the parameter a, but instead it outputs just an empty parameter dictionary because collect_params doesn’t look inside the shared param_dict.
What have you tried to solve it?
To unblock ourselves temporarily, we’re using a workaround in our code: manually overriding the _params of the Block with the correct ParameterDict. That’s not a real option even in the short term. The obvious solution I see would be to make __getitem__ also look in the shared ParameterDict’s items, but I don’t know what issues that might cause elsewhere.
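For illustration, the proposed change might look roughly like the sketch below. It is written against a toy class rather than MXNet itself: SharedAwareDict and its attributes are hypothetical, and a real patch to ParameterDict would need to account for whatever else relies on the current behavior.

```python
from collections import OrderedDict


class SharedAwareDict:
    """Hypothetical dict whose lookups and iteration include a shared parent."""

    def __init__(self, shared=None):
        self._params = OrderedDict()
        self._shared = shared

    def __setitem__(self, name, value):
        self._params[name] = value

    def __getitem__(self, name):
        # Local entries win; otherwise defer to the shared dictionary.
        if name in self._params:
            return self._params[name]
        if self._shared is not None:
            return self._shared[name]
        raise KeyError(name)

    def items(self):
        # Yield local entries first, then shared entries not shadowed locally.
        seen = set()
        for name, value in self._params.items():
            seen.add(name)
            yield name, value
        if self._shared is not None:
            for name, value in self._shared.items():
                if name not in seen:
                    yield name, value


outer = SharedAwareDict()
outer["a"] = 1

inner = SharedAwareDict(shared=outer)
inner["b"] = 2

listed = list(inner.items())   # local "b" first, then shared "a"
looked_up = inner["a"]         # resolved through the shared dict
```

Nothing is written into the shared dictionary here; its entries only become visible when reading from or looping over the inner one, which is the behavior I’m after.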
P.S. Related issue I filed here.