As far as I can tell, the only way to customize an optimizer's L2 regularization per parameter (e.g. to amplify regularization of the embedding layer only) is to call set_wd_mult on the optimizer. However, set_wd_mult is not exposed through the Module API, and wd_mult cannot be set through the optimizer parameters passed into the Module.fit() call. As a result, customizing L2 regularization appears to require creating and initializing the optimizer outside of the module and then passing it into the fit() call. A simple modification to the optimizer’s constructor would allow wd_mult to be passed in through that call instead. Any thoughts on this?
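To make the proposal concrete, here is a minimal sketch of the kind of constructor change I have in mind. It is a toy stand-in, not MXNet source: the class name SketchOptimizer, the effective_wd helper, and the parameter names "embed_weight" and "fc1_weight" are all hypothetical, and the real optimizer does more bookkeeping than this.

```python
# Hypothetical sketch only: a toy optimizer, not the MXNet implementation.
# It shows a constructor that accepts wd_mult and forwards it to set_wd_mult.


class SketchOptimizer:
    def __init__(self, wd=0.0, wd_mult=None):
        self.wd = wd          # base weight decay (L2 coefficient)
        self.wd_mult = {}     # per-parameter multipliers, keyed by name
        if wd_mult is not None:
            # The proposed change: allow multipliers at construction time.
            self.set_wd_mult(wd_mult)

    def set_wd_mult(self, args_wd_mult):
        # Replace the per-parameter weight decay multipliers.
        self.wd_mult = dict(args_wd_mult)

    def effective_wd(self, name):
        # Parameters without an explicit multiplier use 1.0 here.
        return self.wd * self.wd_mult.get(name, 1.0)


# Amplify regularization of a (hypothetical) embedding weight only.
opt = SketchOptimizer(wd=1e-4, wd_mult={"embed_weight": 10.0})
embed_wd = opt.effective_wd("embed_weight")
fc_wd = opt.effective_wd("fc1_weight")
```

If the optimizer parameters given to fit() are forwarded as keyword arguments to the optimizer constructor (my assumption), a wd_mult entry in those parameters would then reach the optimizer without having to build it outside the module.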